ACC gene

The present invention relates to a gene useful in a process to increase the microbial
production of carotenoids.

The carotenoid astaxanthin is distributed in a wide variety of organisms such as animals,
algae and microorganisms. It has a strong antioxidation property against reactive oxygen
species. Astaxanthin is used as a coloring reagent, especially in the industry of farmed fish,
such as salmon, because astaxanthin imparts distinctive orange-red coloration to the
animals and contributes to consumer appeal in the marketplace.

One of the first steps in the carotenogenic pathway of, e.g., Phaffia rhodozyma, is the
condensation of two molecules of acetyl-CoA. Acetyl-CoA is also the substrate for acetyl-CoA
carboxylase, one of the enzymes involved in fatty acid biosynthesis.

In one aspect, the present invention provides a novel DNA fragment comprising a gene
encoding the enzyme acetyl-CoA carboxylase.

More particularly, the present invention provides a DNA containing regulatory regions,
such as promoter and terminator, as well as the open reading frame of the acetyl-CoA
carboxylase gene.

The present invention provides a DNA fragment encoding acetyl-CoA carboxylase in P.
rhodozyma. The said DNA means a cDNA which contains only the open reading frame
flanked by short fragments of its 5'- and 3'-untranslated regions, and a genomic
DNA which also contains its regulatory sequences such as its promoter and terminator,
which are necessary for the expression of the acetyl-CoA carboxylase gene in P. rhodozyma.

Accordingly, the present invention relates to a polynucleotide comprising a nucleic acid
molecule selected from the group consisting of:

(a) nucleic acid molecules encoding at least the mature form of the polypeptide depicted in
SEQ ID NO:3;

(b) nucleic acid molecules comprising the coding sequence as depicted in SEQ ID NO:2;

(c) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (a) or (b);

(d) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (a) to (c) by way of substitution, deletion and/or addition of one or
several amino acids of the amino acid sequence of the polypeptide encoded by a
polynucleotide of (a) to (c);

(e) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (a) or (b);

(f) nucleic acid molecules comprising a fragment or an epitope-bearing portion of a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (e) and having acetyl-CoA
carboxylase activity;

(g) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from a Phaffia or Xanthophyllomyces nucleic acid library using the
primers depicted in SEQ ID NO:4, 5, and 6;

(h) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (a) to (g);

(i) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (a) to (d);

(j) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (h);

(k) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of (a)
to (j), and encoding a polypeptide having an acetyl-CoA carboxylase activity;

(l) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (a) to (k), and encoding a polypeptide
having acetyl-CoA carboxylase activity.

The terms "gene(s)", "polynucleotide", "nucleic acid sequence", "nucleotide sequence",
"DNA sequence" or "nucleic acid molecule(s)" as used herein refer to a polymeric form of
nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers
only to the primary structure of the molecule.

Thus, this term includes double- and single-stranded DNA, and RNA. It also includes
known types of modifications, for example, methylation, "caps", and substitution of one or
more of the naturally occurring nucleotides with an analog. Preferably, the DNA sequence
of the invention comprises a coding sequence encoding the above-defined polypeptide.

A "coding sequence" is a nucleotide sequence which is transcribed into mRNA and/or
translated into a polypeptide when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are determined by a translation start
codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding
sequence can include, but is not limited to, mRNA, cDNA, recombinant nucleotide
sequences or genomic DNA, while introns may be present as well under certain
circumstances. SEQ ID NO:1 depicts the genomic DNA in which the intron sequence is inserted
in the coding sequence for the acetyl-CoA carboxylase gene from P. rhodozyma.
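
The boundary rule stated above (start codon at the 5'-terminus, stop codon at the
3'-terminus) can be illustrated with a minimal Python sketch. It assumes an intron-free
sequence read on the sense strand and the standard genetic code; the function name and the
example sequence are hypothetical and are not taken from SEQ ID NO:1 or NO:2.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_coding_sequence(seq: str):
    """Return the first ATG...stop stretch read in frame, or None if there is none.

    Assumes an intron-free (cDNA-like) sense-strand sequence.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after ATG until an in-frame stop.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    # Hypothetical toy sequence with two flanking bases on each side.
    print(find_coding_sequence("CCATGGCCGAATAAGG"))
```

For genomic DNA such as SEQ ID NO:1, the introns would first have to be removed before a
scan of this kind gives a meaningful result.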

In general, a gene consists of several parts which have different functions from each
other. In eukaryotes, genes which encode a corresponding protein are transcribed to
premature messenger RNA (pre-mRNA), differing from the genes for ribosomal RNA (rRNA),
small nuclear RNA (snRNA) and transfer RNA (tRNA). Although RNA polymerase II
(PolII) plays a central role in this transcription event, PolII cannot start transcription
on its own without a cis element covering an upstream region containing a promoter and an
upstream activation sequence (UAS), and a trans-acting protein factor. At first, a
transcription initiation complex which consists of several basic protein components
recognizes the promoter sequence in the 5'-adjacent region of the gene to be expressed. In
this event, some additional participants are required in the case of a gene which is
expressed under some specific regulation, such as a heat shock response or adaptation to
nutrient starvation. In such a case, a UAS is required to exist in the 5'-untranslated
upstream region around the promoter sequence, and some positive or negative regulator
proteins recognize and bind to the UAS. The strength of the binding of the transcription
initiation complex to the promoter sequence is affected by such binding of the
trans-acting factor around the promoter, and this enables the regulation of transcription
activity.

After the activation of a transcription initiation complex by phosphorylation, the
transcription initiation complex initiates transcription from the transcription start site.
Some parts of the transcription initiation complex are detached as an elongation complex from
the promoter region in the 3' direction of the gene (this step is called a promoter
clearance event) and the elongation complex continues the transcription until it reaches
a termination sequence that is located in the 3'-adjacent downstream region of the gene.
Pre-mRNA thus generated is modified in the nucleus by the addition of a cap structure at the
cap site, which almost corresponds to the transcription start site, and by the addition of
polyA stretches at the polyA signal, which is located in the 3'-adjacent downstream region.
Next, intron structures are removed from the coding region and exon parts are combined
to yield an open reading frame whose sequence corresponds to the primary amino acid
sequence of a corresponding protein. This modification, in which a mature mRNA is
generated, is necessary for stable gene expression. cDNA in general terms corresponds to
the DNA sequence which is reverse-transcribed from this mature mRNA sequence. It can
be synthesized experimentally by a reverse transcriptase derived from viral species, using
a mature mRNA as a template.
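
The intron-removal step described above can be sketched in Python as follows. This is an
illustration only: it assumes the exon coordinates are already known (0-based, end-exclusive),
and the function name, the coordinates and the toy sequence are hypothetical rather than
taken from the sequence listing.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments of a transcript, dropping everything between them.

    exons: (start, end) pairs, 0-based and end-exclusive, in 5'->3' order.
    """
    pieces = []
    previous_end = 0
    for start, end in exons:
        # Exons must be ordered, non-overlapping and inside the sequence.
        if not (previous_end <= start < end <= len(pre_mrna)):
            raise ValueError(f"invalid exon coordinates: {(start, end)}")
        pieces.append(pre_mrna[start:end])
        previous_end = end
    return "".join(pieces)


if __name__ == "__main__":
    # Hypothetical toy transcript: exons in upper case, one intron in lower case.
    transcript = "ATGGCC" + "gtaagtttag" + "GAATAA"
    print(splice(transcript, [(0, 6), (16, 22)]))
```

The joined exon sequence corresponds to what a cDNA would carry, which is why a cDNA rather
than the genomic fragment is normally used for expression in a host that cannot splice.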

To express a gene derived from a eukaryote, a procedure in which the cDNA is
cloned into an expression vector for E. coli is often used. This results from the fact that
the specificity of intron structure varies among organisms and that an organism is unable
to recognize intron sequences from other species. In fact, prokaryotes have no intron
structure in their own genetic background. Even in yeast, the genetic background differs
between the Ascomycetes, to which Saccharomyces cerevisiae belongs, and the Basidiomycetes,
to which P. rhodozyma belongs; e.g., the intron structure of the actin gene from P.
rhodozyma can be neither recognized nor spliced by the ascomycetous yeast S. cerevisiae.

Intron structures of some kinds of genes appear to be involved in the regulation of the
expression of those genes. It might be important to use a genomic fragment which has its
introns in a case of self-cloning of a gene of interest whose intron structure is involved
in such regulation of its own gene expression.

To apply a genetic engineering method to a strain improvement study, it is necessary to
study the genetic mechanism of events such as transcription and translation. To study
this genetic mechanism, it is important to determine genetic sequences such as the UAS,
promoter, intron structure and terminator.

According to this invention, the gene encoding acetyl-CoA carboxylase (ACC gene)
from P. rhodozyma, including its 5'- and 3'-adjacent regions as well as its intron
structure, was determined.

The invention further encompasses polynucleotides that differ from one of the nucleotide
sequences shown in SEQ ID NO:2 (and portions thereof) due to degeneracy of the genetic
code and also encode an acetyl-CoA carboxylase as encoded by the nucleotide
sequences shown in SEQ ID NO:2. Further, the polynucleotide of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NO:3. In a
still further embodiment, the polynucleotide of the invention encodes a full length P.
rhodozyma protein which is substantially homologous to an amino acid sequence of SEQ
ID NO:3.

In addition, it will be appreciated by those skilled in the art that DNA sequence
polymorphisms that lead to changes in the amino acid sequences may exist within a population
(e.g., the P. rhodozyma population). Such genetic polymorphism in the acetyl-CoA
carboxylase gene may exist among individuals within a population due to natural variation.

As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules
comprising an open reading frame encoding an acetyl-CoA carboxylase, preferably an
acetyl-CoA carboxylase from P. rhodozyma.

Such natural variations can typically result in 1-5 % variance in the nucleotide sequence of
the acetyl-CoA carboxylase gene. Any and all such nucleotide variations and resulting
amino acid polymorphisms in acetyl-CoA carboxylase that are the result of natural
variation and that do not alter the functional activity of acetyl-CoA carboxylase are
intended to be within the scope of the invention.

Polynucleotides corresponding to natural variants and non-P. rhodozyma homologues of
the acetyl-CoA carboxylase cDNA of the invention can be isolated based on their
homology to the P. rhodozyma acetyl-CoA carboxylase polynucleotides disclosed herein, using
the polynucleotide of the invention, or a portion thereof, as a hybridization probe according
to standard hybridization techniques under stringent hybridization conditions.
Accordingly, in another embodiment, a polynucleotide of the invention is at least 15
nucleotides in length. Preferably it hybridizes under stringent conditions to the nucleic
acid molecule comprising a nucleotide sequence of the polynucleotide of the present
invention, e.g. SEQ ID NO:2. In other embodiments, the nucleic acid is at least 20, 30, 50,
100, 250 or more nucleotides in length. The term "hybridizes under stringent conditions"
is defined above and is intended to describe conditions for hybridization and washing
under which nucleotide sequences at least 60% identical to each other typically remain
hybridized to each other. Preferably, the conditions are such that sequences at least about
65% or 70%, more preferably at least about 75% or 80%, and even more preferably at least
about 85%, 90% or 95% or more identical to each other typically remain hybridized to
each other. Preferably, a polynucleotide of the invention that hybridizes under stringent
conditions to a sequence of SEQ ID NO:2 corresponds to a naturally occurring nucleic
acid molecule.

In the present invention, the polynucleotide sequence includes SEQ ID NO:2 and
fragments thereof having polynucleotide sequences which hybridize to SEQ ID NO:2 under
stringent conditions which are sufficient to identify specific binding to SEQ ID NO:2. For
example, any combination of the following hybridization and wash conditions may be
used to achieve the required specific binding:

High Stringent Hybridization: 6X SSC, 0.5% SDS, 100 µg/ml denatured salmon sperm
DNA, 50% formamide, incubate overnight with gentle rocking at 42°C.

High Stringent Wash: 1 wash in 2X SSC, 0.5% SDS at room temperature for 15 minutes,
followed by another wash in 0.1X SSC, 0.5% SDS at room temperature for 15 minutes.

Low Stringent Hybridization: 6X SSC, 0.5% SDS, 100 µg/ml denatured salmon sperm
DNA, 50% formamide, incubate overnight with gentle rocking at 37°C.

Low Stringent Wash: 1 wash in 0.1X SSC, 0.5% SDS at room temperature for 15 minutes.

Moderately stringent conditions may be obtained by varying the temperature at which the
hybridization reaction occurs and/or the wash conditions as set forth above. In the
present invention, it is preferred to use high stringent hybridization and wash conditions
to define the antisense activity against the acetyl-CoA carboxylase gene from P. rhodozyma.

The term "homology" means that the respective nucleic acid molecules or encoded
proteins are functionally and/or structurally equivalent. The nucleic acid molecules that are
homologous to the nucleic acid molecules described above and that are derivatives of said
nucleic acid molecules are, for example, variations of said nucleic acid molecules which
represent modifications having the same biological function, in particular encoding proteins
with the same or substantially the same biological function. They may be naturally
occurring variations, such as sequences from other plant varieties or species, or mutations.
These mutations may occur naturally or may be obtained by mutagenesis techniques. The allelic
variations may be naturally occurring allelic variants as well as synthetically produced or
genetically engineered variants. Structural equivalents can, for example, be identified by
testing the binding of said polypeptide to antibodies. Structural equivalents have similar
immunological characteristics, e.g. comprise similar epitopes.

As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA
molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural
protein). Preferably, the polynucleotide encodes a natural P. rhodozyma acetyl-CoA
carboxylase.

acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid "identity").
The percent homology between the two sequences is a function of the number of identical
positions shared by the sequences (i.e., % homology = number of identical positions / total
number of positions x 100). The homology can be determined by computer programs such as
Blast 2.0 [Altschul, Nucleic Acids Res., 25:3389-3402 (1997)]. In this invention, GENETYX-
SV/RC software (Software Development Co., Ltd., Tokyo, Japan) is used with its
default algorithm as such homology analysis software. This software uses the Lipman-Pearson
method for its analytic algorithm.
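
The percent homology formula given above can be written out as a short Python sketch. It
assumes the two sequences have already been aligned to equal length (for example by one of
the programs mentioned), and it makes one assumption not stated in the text: a gap
character "-" is never counted as an identical position. The function name and the example
peptides are hypothetical.

```python
def percent_homology(seq_a: str, seq_b: str) -> float:
    """% homology = identical positions / total positions x 100, on aligned input."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    identical = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"  # assumption: aligned gaps do not count as identity
    )
    return identical / len(seq_a) * 100


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments; 6 of 8 positions are identical.
    print(percent_homology("MKTAYIAK", "MKSAYLAK"))  # 75.0
```

This does not reproduce the Lipman-Pearson method or the Blast heuristics; it only evaluates
the ratio once an alignment is available.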

A nucleic acid molecule encoding an acetyl-CoA carboxylase homologous to a protein
with an amino acid sequence of SEQ ID NO:3 can be created by introducing one or more
nucleotide substitutions, additions or deletions into a nucleotide sequence of the
polynucleotide of the present invention, in particular of SEQ ID NO:2, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded
protein. Mutations can be introduced into the sequences of, e.g., SEQ ID NO:2 by
standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
Preferably, conservative amino acid substitutions are made at one or more predicted
non-essential amino acid residues. A "conservative amino acid substitution" is one in which
the amino acid residue is replaced with an amino acid residue having a similar side chain.
Families of amino acid residues having similar side chains have been defined in the art.
These families include amino acids with basic side chains (e.g., lysine, arginine, histidine),
acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g.,
glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains
(e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan),
beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential
amino acid residue in an acetyl-CoA carboxylase is preferably replaced with another amino
acid residue from the same family. Alternatively, in another embodiment, mutations can
be introduced randomly along all or part of an acetyl-CoA carboxylase coding sequence,
such as by saturation mutagenesis, and the resultant mutants can be screened for an
acetyl-CoA carboxylase activity described herein to identify mutants that retain acetyl-CoA
carboxylase activity. Following mutagenesis of one of the sequences of SEQ ID NO:2, the
encoded protein can be expressed recombinantly and the activity of the protein can be
determined using, for example, assays described herein.
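
The side-chain families listed above can be encoded as a small lookup table, as in the
following Python sketch. The family memberships are copied from the list in the text
(one-letter amino acid codes); the function names are hypothetical, and treating a
substitution as conservative whenever both residues share at least one family is an
assumption made for illustration.

```python
# Side-chain families as enumerated in the text, in one-letter code.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Names of the families that contain both residues."""
    a, b = original.upper(), replacement.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues share at least one side-chain family."""
    return bool(shared_families(original, replacement))


if __name__ == "__main__":
    print(is_conservative("K", "R"))   # both basic
    print(is_conservative("D", "K"))   # acidic versus basic
    print(shared_families("H", "W"))   # share only the aromatic family
```

A check of this kind only flags candidate substitutions; whether a given mutant retains
acetyl-CoA carboxylase activity still has to be determined by an assay.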

A polynucleotide of the present invention, e.g., a nucleic acid molecule having a nucleotide
sequence of SEQ ID NO:2, or a portion thereof, can be isolated using standard molecular

Preferably, the polynucleotide of the invention comprises one of the nucleotide sequences
shown in SEQ ID NO:2. The sequence of SEQ ID NO:2 corresponds to the P. rhodozyma
acetyl-CoA carboxylase cDNAs of the invention.

Further, the polynucleotide of the invention comprises a nucleic acid molecule which is a
complement of one of the nucleotide sequences of the above-mentioned polynucleotides or a
portion thereof. A nucleic acid molecule which is complementary to one of the nucleotide
sequences shown in SEQ ID NO:2 is one which is sufficiently complementary to one of the
nucleotide sequences shown in SEQ ID NO:2 such that it can hybridize to one of the
nucleotide sequences shown in SEQ ID NO:2, thereby forming a stable duplex.
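
As a simple illustration of complementarity, the following Python sketch computes the
reverse complement of a DNA strand and checks for perfect complementarity. It is a
simplification: the text only requires a strand to be "sufficiently" complementary to form
a stable duplex, which this sketch does not model. The function names and the example
sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b pairs with strand_a at every position (no mismatches)."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))            # GGCCAT
    print(is_perfect_complement("ATGGCC", "GGCCAT"))
```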

The polynucleotide of the invention comprises a nucleotide sequence which is at least
about 60%, preferably at least about 65-70%, more preferably at least about 70-80%,
80-90%, or 90-95%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or
more homologous to a nucleotide sequence shown in SEQ ID NO:2, or a portion thereof.
The polynucleotide of the invention comprises a nucleotide sequence which hybridizes,
e.g., hybridizes under stringent conditions as defined herein, to one of the nucleotide
sequences shown in SEQ ID NO:2, or a portion thereof.

Moreover, the polynucleotide of the invention can comprise only a portion of the coding
region of one of the sequences in SEQ ID NO:2, for example a fragment which can be used
as a probe or primer or a fragment encoding a biologically active portion of an acetyl-CoA
carboxylase. The nucleotide sequences determined from the cloning of the acetyl-CoA
carboxylase gene from P. rhodozyma allow for the generation of probes and primers
designed for use in identifying and/or cloning acetyl-CoA carboxylase homologues in
other cell types and organisms. The probe/primer typically comprises a substantially
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide
sequence that hybridizes under stringent conditions to at least about 12 or 15, preferably
about 20 or 25, more preferably about 40, 50 or 75 consecutive nucleotides of a sense
strand of one of the sequences set forth, e.g., in SEQ ID NO:2, an anti-sense sequence
of one of the sequences, e.g., set forth in SEQ ID NO:2, or naturally occurring mutants
thereof. Primers based on a nucleotide sequence of the invention can be used in PCR
reactions to clone acetyl-CoA carboxylase homologues. Probes based on the acetyl-CoA
carboxylase nucleotide sequences can be used to detect transcripts or genomic sequences
encoding the same or homologous proteins. The probe can further comprise a label group
attached thereto, e.g. the label group can be a radioisotope, a fluorescent compound, an
enzyme, or an enzyme co-factor. Such probes can be used as a part of a genomic marker test
kit for identifying cells which express an acetyl-CoA carboxylase, such as by measuring a
level of an acetyl-CoA carboxylase-encoding nucleic acid molecule in a sample of cells, e.g.,
detecting acetyl-CoA carboxylase mRNA levels or determining whether a genomic
acetyl-CoA carboxylase gene has been mutated or deleted.

The polynucleotide of the invention encodes a polypeptide or portion thereof which
includes an amino acid sequence which is sufficiently homologous to an amino acid
sequence of SEQ ID NO:3 such that the protein or portion thereof maintains an acetyl-CoA
carboxylase activity, in particular an acetyl-CoA carboxylase activity as described in
the examples in microorganisms or plants. As used herein, the language "sufficiently
homologous" refers to proteins or portions thereof which have amino acid sequences
which include a minimum number of identical or equivalent (e.g., an amino acid residue
which has a similar side chain as an amino acid residue in one of the sequences of the
polypeptide of the present invention) amino acid residues to an amino acid sequence of SEQ ID
NO:3 such that the protein or portion thereof has an acetyl-CoA carboxylase activity.
Examples of an acetyl-CoA carboxylase activity are also described herein.

The protein is at least about 60-65%, preferably at least about 66-70%, and more
preferably at least about 70-80%, 80-90%, 90-95%, and most preferably at least about 96%,
97%, 98%, 99% or more homologous to an entire amino acid sequence of SEQ ID NO:3.

Portions of proteins encoded by the acetyl-CoA carboxylase polynucleotide of the
invention are preferably biologically active portions of the acetyl-CoA carboxylase.

As mentioned herein, the term "biologically active portion of acetyl-CoA carboxylase" is
intended to include a portion, e.g., a domain/motif, that has acetyl-CoA carboxylase
activity or has an immunological activity such that it binds to an antibody binding
specifically to acetyl-CoA carboxylase. To determine whether an acetyl-CoA carboxylase or a
biologically active portion thereof can participate in the metabolism, an assay of enzymatic
activity may be performed. Such assay methods are well known to those skilled in the art, as
detailed in the Examples. Additional nucleic acid fragments encoding biologically active
portions of an acetyl-CoA carboxylase can be prepared by isolating a portion of one of the
sequences in SEQ ID NO:2, expressing the encoded portion of the acetyl-CoA carboxylase
or peptide (e.g., by recombinant expression in vitro) and assessing the activity of the
encoded portion of the acetyl-CoA carboxylase or peptide.

At first, a partial gene fragment containing a portion of the ACC gene was cloned by using
the degenerate PCR method. Degenerate PCR is a method to clone a gene of interest
whose amino acid sequence has high homology to that of a known enzyme from other species
which has the same or a similar function. A degenerate primer, which is used as a primer in
degenerate PCR, is designed by reverse translation of the amino acid sequence to the
corresponding nucleotides ("degenerated"). For such a degenerate primer, a mixed primer
which contains any of A, C, G or T, or a primer containing inosine, at an ambiguity code is
generally used. In this invention, such mixed primers were used as degenerate primers to
clone the above gene.
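
The reverse translation that underlies degenerate primer design can be sketched in Python
as follows. The sketch assumes the standard genetic code and IUPAC ambiguity codes (R = A/G,
Y = C/T, H = A/C/T, N = any base); leucine, arginine and serine are deliberately left out
because no single ambiguity-coded codon covers exactly their six codons. The peptide used
in the example is hypothetical and is not one of the primers of SEQ ID NO:4 to 6.

```python
from itertools import product

# IUPAC ambiguity codes used in this sketch (subset).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "H": "ACT", "N": "ACGT",
}

# One degenerate codon per amino acid (standard genetic code).
# Leu, Arg and Ser are omitted: a single IUPAC codon would over-cover them.
DEGENERATE_CODON = {
    "M": "ATG", "W": "TGG", "F": "TTY", "Y": "TAY", "H": "CAY", "Q": "CAR",
    "N": "AAY", "K": "AAR", "D": "GAY", "E": "GAR", "C": "TGY", "I": "ATH",
    "A": "GCN", "G": "GGN", "P": "CCN", "T": "ACN", "V": "GTN",
}


def degenerate_primer(peptide: str) -> str:
    """Reverse-translate a peptide into an ambiguity-coded (mixed) primer."""
    try:
        return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())
    except KeyError as err:
        raise ValueError(f"no single degenerate codon for residue {err.args[0]!r}") from None


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the mixed primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence contained in the mixed primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    primer = degenerate_primer("MWKD")  # hypothetical conserved peptide
    print(primer, degeneracy(primer))
    print(expand(primer))
```

In practice, peptide stretches rich in low-degeneracy residues such as Met and Trp are
preferred, since the degeneracy multiplies with every ambiguous position.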

An entire gene containing its coding region with its introns as well as its regulatory
regions such as a promoter or a terminator can be cloned from a chromosome by screening a
genomic library which is constructed in a phage vector or plasmid vector in an appropriate
host, by using a partial DNA fragment obtained by degenerate PCR as described above as a
probe after it has been labeled. Generally, E. coli as a host strain and an E. coli vector,
a phage vector such as a λ phage vector, or a plasmid vector such as a pUC vector, are often
used in the construction of a library and in the following genetic manipulations such as
sequencing, restriction digestion, ligation and the like. In this invention, an EcoRI
genomic library of P. rhodozyma was constructed in a derivative of the λ vector, λZAPII. The
insert size, i.e. the length of insert that must be cloned, was determined by Southern blot
hybridization for the gene before construction of the library. In this invention, the DNA
used as a probe was labeled with digoxigenin (DIG), a steroid hapten, instead of a
conventional 32P label, following the protocol prepared by the supplier (Boehringer
Mannheim, Mannheim, Germany). A genomic library constructed from the chromosome of P.
rhodozyma was screened by using a DIG-labeled DNA fragment which had a portion of the gene
of interest as a probe. Hybridized plaques were picked up and used for further study. When
λZAPII (insert size below 9 kb) was used in the construction of the genomic library, the in
vivo excision protocol was conveniently used for the succeeding step of cloning into the
plasmid vector by using a derivative of the single-stranded M13 phage, ExAssist phage
(Stratagene, La Jolla, USA). The plasmid DNA thus obtained was examined by sequencing.

In this invention, we used the automated fluorescent DNA sequencer, ALFred system
(Pharmacia, Uppsala, Sweden), using an autocycle sequencing protocol in which Taq
DNA polymerase is employed in most cases of sequencing.

After the determination of the genomic sequence, the sequence of the coding region was used
for cloning the cDNA of the corresponding gene. The PCR method was also exploited to
clone the cDNA fragment. PCR primers whose sequences were identical to the sequences
at the 5'- and 3'-ends of the open reading frame (ORF) were synthesized with the addition
of an appropriate restriction site, and PCR was performed by using those PCR primers.

The present invention further relates to a vector in which the polynucleotide of the present
invention is operatively linked to expression control sequences allowing expression in
prokaryotic or eukaryotic host cells. The nature of such control sequences differs depending
upon the host organism. In prokaryotes, control sequences generally include promoters,
ribosomal binding sites, and terminators. In eukaryotes, control sequences generally
include promoters, terminators and, in some instances, enhancers, transactivators, or
transcription factors.

The term "control sequence" is intended to include, at a minimum, components the
presence of which is necessary for expression, and may also include additional advantageous
components.

The term "operably linked" refers to a juxtaposition wherein the components so described
are in a relationship permitting them to function in their intended manner. A control
sequence "operably linked" to a coding sequence is ligated in such a way that expression of
the coding sequence is achieved under conditions compatible with the control sequences.
In case the control sequence is a promoter, it is obvious to a skilled person that
double-stranded nucleic acid is used.

Regulatory sequences include those which direct constitutive expression of a nucleotide
sequence in many types of host cell and those which direct expression of the nucleotide
sequence only in certain host cells or under certain conditions. It will be appreciated by
those skilled in the art that the design of the expression vector can depend on such factors
as the choice of the host cell to be transformed, the level of expression of protein desired,
etc. The expression vectors of the invention can be introduced into host cells to thereby
produce proteins or peptides, including fusion proteins or peptides, encoded by
polynucleotides as described herein.

The recombinant expression vectors of the invention can be designed for expression of
acetyl-CoA carboxylase in prokaryotic or eukaryotic cells. For example, genes encoding the
polynucleotide of the invention can be expressed in bacterial cells such as E. coli, insect
cells (using baculovirus expression vectors), yeast and other fungal cells, algae, ciliates of
the types: Holotrichia, Peritrichia, Spirotrichia, Suctoria, Tetrahymena, Paramecium,
Colpidium, Glaucoma, Platyophrya, Potomacus, Pseudocohnilembus, Euplotes, Engelmaniella, and
Stylonychia, especially Stylonychia lemnae with vectors following a transformation method
as described in WO9801572, and multicellular plant cells. Alternatively, the recombinant
expression vector can be transcribed and translated in vitro, for example using T7
promoter regulatory sequences and T7 polymerase.

utilized in the bacterium chosen for expression, such as E. coli. Such alteration of nucleic
acid sequences of the invention can be carried out by standard DNA synthesis techniques.

Further, the acetyl-CoA carboxylase vector can be a yeast expression vector. Examples of
vectors for expression in the yeast S. cerevisiae include pYepSec1, pMFa, pJRY88, and pYES2
(Invitrogen, San Diego, USA). Vectors and methods for the construction of vectors
appropriate for use in other fungi, such as the filamentous fungi, are known to the skilled
artisan.

Alternatively, the polynucleotide of the invention can be introduced in insect cells using
baculovirus expression vectors. Baculovirus vectors available for expression of proteins in
cultured insect cells (e.g., Sf9 cells) include the pAc series and the pVL series.

i ■ \ 

Alternatively, the polynucleotide of the invention is introduced into mammalian cells using a
mammalian expression vector. Examples of mammalian expression vectors include
pCDM8 and pMT2PC. When used in mammalian cells, the expression vector's control
functions are often provided by viral regulatory elements. For example, commonly used
promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

The recombinant mammalian expression vector can be capable of directing expression of
the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory
elements are used to express the nucleic acid). Tissue-specific regulatory elements are
known in the art. Non-limiting examples of suitable tissue-specific promoters include the
albumin promoter (liver-specific), lymphoid-specific promoters, in particular promoters
of T cell receptors and immunoglobulins, neuron-specific promoters (e.g., the
neurofilament promoter), pancreas-specific promoters, and mammary gland-specific
promoters (e.g., milk whey promoter; US 4,873,316 and EP 264,166). Developmentally
regulated promoters are also encompassed, for example the murine hox promoters and the
fetoprotein promoter.

The ACC gene thus expressed can be verified for its activity, e.g., by an enzyme assay method.
Some experimental protocols are described in the literature. The following is one of
the methods used for the determination of acetyl-CoA carboxylase activity: Assays
are performed by measuring the loss of acetyl-CoA and/or the production of malonyl-CoA
at 5 min intervals for 20 min, using reverse phase HPLC. The rate of conversion of
acetyl-CoA to malonyl-CoA is found to be linear for 20 min, and velocities are calculated by
linear regression analysis of the malonyl-CoA concentration with respect to time. The

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gene fragment would form a complex with a mature mRNA fragment of the objective gene
in vivo and inhibit an efficient translation from mRNA, as a consequence.

An "antisense" nucleic acid molecule comprises a nucleotide sequence which is
complementary to a "sense" nucleic acid molecule encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. Accordingly, an antisense nucleic acid molecule can hydrogen bond to a sense
nucleic acid molecule. The antisense nucleic acid molecule can be complementary to an
entire acetyl-CoA carboxylase-coding strand, or to only a portion thereof. Accordingly, an
antisense nucleic acid molecule can be antisense to a "coding region" of the coding strand
of a nucleotide sequence encoding an acetyl-CoA carboxylase. The term "coding region"
refers to the region of the nucleotide sequence comprising codons which are translated
into amino acid residues. Further, the antisense nucleic acid molecule can be antisense to a
"noncoding region" of the coding strand of a nucleotide sequence encoding acetyl-CoA
carboxylase. The term "noncoding region" refers to 5' and 3' sequences which flank the
coding region and are not translated into a polypeptide (i.e., also referred to as 5' and 3'
untranslated regions).
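
As a minimal sketch of the complementarity described above, an antisense DNA oligonucleotide
against a chosen window of a coding strand can be derived as the reverse complement of that
window. The coding-strand fragment, the window position and the function name below are
hypothetical and serve only as an illustration; they are not derived from SEQ ID NO:2.

```python
# Watson-Crick pairing for DNA: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Antisense oligonucleotide (5'->3') against coding_strand[start:start+length]."""
    if start < 0 or start + length > len(coding_strand):
        raise ValueError("window lies outside the coding strand")
    return reverse_complement(coding_strand[start:start + length])


# Hypothetical coding-strand fragment containing a translation start codon (ATG).
coding = "GGCACCATGGCTTCAGAAT"
atg = coding.find("ATG")                      # position of the start codon
oligo = antisense_oligo(coding, atg - 5, 15)  # 15-mer spanning the start site
print(oligo)
```

The window here is placed so that it surrounds the translation start site; any other portion
of the coding or noncoding region could be selected in the same way.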

Given the coding strand sequences encoding acetyl-CoA carboxylase disclosed herein,
antisense nucleic acid molecules of the invention can be designed according to the rules of
Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary
to the entire coding region of acetyl-CoA carboxylase mRNA, but can also be an
oligonucleotide which is antisense to only a portion of the coding or noncoding region of
acetyl-CoA carboxylase mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of acetyl-CoA carboxylase
mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35,
40, 45 or 50 nucleotides in length. An antisense nucleic acid molecule of the invention can
be constructed using chemical synthesis and enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid molecule (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules
or to increase the physical stability of the duplex formed between the antisense and sense
nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can
be used. Examples of modified nucleotides which can be used to generate the antisense
nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-
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In addition to naturally-occurring variants' of the acetyl-CoA carboxylase sequence that 
may exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into a nucleotide sequence of the polynucleotide encoding
acetyl-CoA carboxylase, thereby leading to changes in the amino acid sequence of the encoded
acetyl-CoA carboxylase, without altering the functional ability of the acetyl-CoA
carboxylase. For example, nucleotide substitutions leading to amino acid substitutions at
"non-essential" amino acid residues can be made in a sequence of the polynucleotide encoding
acetyl-CoA carboxylase, e.g. SEQ ID NO:2. A "non-essential" amino acid residue is a
residue that can be altered from the wild-type sequence of one of the acetyl-CoA carboxylases
without altering the activity of said acetyl-CoA carboxylase, whereas an "essential" amino
acid residue is required for acetyl-CoA carboxylase activity. Other amino acid residues,
however, (e.g., those that are not conserved or only semi-conserved in the domain having
acetyl-CoA carboxylase activity) may not be essential for activity and thus are likely to be
amenable to alteration without altering acetyl-CoA carboxylase activity.

Accordingly, the invention relates to polynucleotides encoding acetyl-CoA carboxylase
that contain changes in amino acid residues that are not essential for acetyl-CoA
carboxylase activity. Such acetyl-CoA carboxylase differs in amino acid sequence from a
sequence contained in SEQ ID NO:3 yet retains the acetyl-CoA carboxylase activity
described herein. The polynucleotide can comprise a nucleotide sequence encoding a
polypeptide, wherein the polypeptide comprises an amino acid sequence at least about
60% identical to an amino acid sequence of SEQ ID NO:3 and has acetyl-CoA carboxylase
activity. Preferably, the protein encoded by the nucleic acid molecule is at least about
60-65% identical to the sequence in SEQ ID NO:3, more preferably at least about 60-70%
identical to one of the sequences in SEQ ID NO:3, even more preferably at least about
70-80%, 80-90%, 90-95% homologous to the sequence in SEQ ID NO:3, and most preferably
at least about 96%, 97%, 98%, or 99% identical to the sequence in SEQ ID NO:3.
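
The identity thresholds recited above can be illustrated with a short sketch that scores two
already-aligned sequences. It is assumed here, for illustration only, that percent identity is
the number of identical aligned positions divided by the total number of aligned positions,
with a gap ("-") counted as a non-identical position; the two sequences and the function name
are hypothetical and are not taken from SEQ ID NO:3.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences of equal length ('-' marks a gap).

    Assumed convention for this sketch: identical positions / all aligned
    positions * 100, with gap columns counted as mismatches.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments (one gap, one substitution).
reference = "MKT-AYIAKQ"
variant = "MKTLAYLAKQ"

identity = percent_identity(reference, variant)
meets_60_percent = identity >= 60.0
print(identity, meets_60_percent)
```

Other gap conventions exist (for example, ignoring gap columns), and they give different
percentages for the same alignment, so the convention used should be stated alongside any
reported value.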

To determine the percent homology of two amino acid sequences (e.g., one of the
sequences of SEQ ID NO:3 and a mutant form thereof) or of two nucleic acids, the sequences
are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence
of one protein or nucleic acid for optimal alignment with the other protein or nucleic
acid). The amino acid residues or nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position in one sequence (e.g., one of the
sequences of SEQ ID NO:2 or 3) is occupied by the same amino acid residue or nucleotide
as the corresponding position in the other sequence (e.g., a mutant form of the sequence
selected), then the molecules are homologous at that position (i.e., as used herein amino
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biology techniques and the sequence information provided herein. For example, acetyl- 
CoA carboxylase cDNA can be isolated from a library using all or a portion of one of the
sequences of the polynucleotide of the present invention as a hybridization probe and
standard hybridization techniques. Moreover, a polynucleotide encompassing all or a
portion of one of the sequences of the polynucleotide of the present invention can be isolated
by the polymerase chain reaction using oligonucleotide primers designed based upon this
sequence (e.g., a nucleic acid molecule encompassing all or a portion of one of the
sequences of the polynucleotide of the present invention can be isolated by the polymerase chain
reaction using oligonucleotide primers, e.g. of SEQ ID NO:4, 5, or 6, designed based upon
this same sequence of the polynucleotide of the present invention). For example, mRNA can
be isolated from cells, e.g. Phaffia (e.g., by the guanidinium-thiocyanate extraction
procedure of Chirgwin et al.) and cDNA can be prepared using reverse transcriptase (e.g.,
Moloney MLV reverse transcriptase or AMV reverse transcriptase, available from Promega
(Madison, USA)). Synthetic oligonucleotide primers for polymerase chain reaction
amplification can be designed based upon one of the nucleotide sequences shown in SEQ
ID NO:2. A polynucleotide of the invention can be amplified using cDNA or,
alternatively, genomic DNA, as a template and appropriate oligonucleotide primers
according to standard PCR amplification techniques. The polynucleotide so amplified can
be cloned into an appropriate vector and characterized by DNA sequence analysis.
Furthermore, oligonucleotides corresponding to an acetyl-CoA carboxylase nucleotide
sequence can be prepared by standard synthetic techniques, e.g., using an automated DNA
synthesizer.

The terms "fragment", "fragment of a sequence" or "part of a sequence" mean a truncated
sequence of the original sequence referred to. The truncated sequence (nucleic acid or
protein sequence) can vary widely in length; the minimum size being a sequence of
sufficient size to provide a sequence with at least a comparable function and/or activity of the
original sequence referred to, while the maximum size is not critical. In some applications,
the maximum size usually is not substantially greater than that required to provide the
desired activity and/or function(s) of the original sequence.

Typically, the truncated amino acid sequence will range from about 5 to about 60 amino
acids in length. More typically, however, the sequence will be a maximum of about 50
amino acids in length, preferably a maximum of about 30 amino acids. It is usually
desirable to select sequences of at least about 10, 12 or 15 amino acids, up to a maximum of
about 20 or 25 amino acids.

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The term "epitope" relates to specific immunoreactive sites within an antigen, also known
as antigenic determinants. These epitopes can be a linear array of monomers in a
polymeric composition - such as amino acids in a protein - or consist of or comprise a more
complex secondary or tertiary structure. Those of skill will recognize that all immunogens
(i.e., substances capable of eliciting an immune response) are antigens; however, some
antigens, such as haptens, are not immunogens but may be made immunogenic by
coupling to a carrier molecule. The term "antigen" includes references to a substance to
which an antibody can be generated and/or to which the antibody is specifically
immunoreactive.

The term "one or several amino acids" relates to at least one amino acid but not more than
that number of amino acids which would result in a homology of below 60% identity.
Preferably, the identity is more than 70% or 80%, more preferred are 85%, 90% or 95%,
even more preferred are 96%, 97%, 98%, or 99% identity.

The term "acetyl-CoA carboxylase" or "acetyl-CoA carboxylase activity" relates to
enzymatic activities of a polypeptide as described below or which can be determined in an enzyme
assay method. Furthermore, polypeptides that are inactive in an assay herein but are
recognized by an antibody specifically binding to acetyl-CoA carboxylase, i.e., having one or
more acetyl-CoA carboxylase epitopes, are also comprised under the term "acetyl-CoA
carboxylase". In these cases activity refers to their immunological activity.

The terms "polynucleotide" and "nucleic acid molecule" also relate to "isolated"
polynucleotides or nucleic acid molecules. An "isolated" nucleic acid molecule is one which is
separated from other nucleic acid molecules which are present in the natural source of the
nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally
flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived.

For example, in various embodiments, the isolated polynucleotide can contain less than
about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally
flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is
derived (e.g., a Phaffia cell). Moreover, the polynucleotides of the present invention, in
particular an "isolated" nucleic acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture medium when produced by recombinant
techniques, or chemical precursors or other chemicals when chemically synthesized.

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D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine,
2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v),
wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid
(v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a polynucleotide has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted polynucleotide will be of an antisense
orientation to a target polynucleotide of interest, described further in the following
subsection).

The antisense nucleic acid molecules of the invention are typically administered to a cell or
generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic
DNA encoding an acetyl-CoA carboxylase to thereby inhibit expression of the protein, e.g.,
by inhibiting transcription and/or translation. The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule which binds to DNA duplexes, through specific
interactions in the major groove of the double helix. The antisense molecule can be
modified such that it specifically binds to a receptor or an antigen expressed on a selected
cell surface, e.g., by linking the antisense nucleic acid molecule to a peptide or an antibody
which binds to a cell surface receptor or antigen. The antisense nucleic acid molecule can
also be delivered to cells using the vectors described herein. To achieve sufficient
intracellular concentrations of the antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the control of a strong prokaryotic, viral,
or eukaryotic (including plant) promoter are preferred.

The antisense nucleic acid molecule of the invention may, e.g., be an α-anomeric nucleic
acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded
hybrids with complementary RNA in which, contrary to the usual β-units, the strands run
parallel to each other. The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide or a chimeric RNA-DNA analogue.



Further, the antisense nucleic acid molecule of the invention can be a ribozyme. Ribozymes
are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a
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! 

single-stranded nucleic acid, such as an mRNA, to which they have a complementary
region. Thus, ribozymes (e.g., hammerhead ribozymes) can be used to catalytically cleave
acetyl-CoA carboxylase mRNA transcripts to thereby inhibit translation of mRNA. A
ribozyme having specificity for an acetyl-CoA carboxylase-encoding nucleic acid molecule
can be designed based upon the nucleotide sequence of an acetyl-CoA carboxylase cDNA
disclosed herein or on the basis of a heterologous sequence to be isolated according to
methods taught in this invention. For example, a derivative of a Tetrahymena L-19 IVS
RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in an encoding mRNA (see, e.g., US
4,987,071 and US 5,116,742). Alternatively, acetyl-CoA carboxylase mRNA can be used to
select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules.

The application of the antisense method to construct a carotenoid overproducing strain
from P. rhodozyma is disclosed in EP 1,158,051.

In one embodiment the present invention relates to a method of making a recombinant
host cell comprising introducing the vector or the polynucleotide of the present invention
into a host cell.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection", conjugation and transduction are intended to refer to a variety of
art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, natural competence, chemical-mediated transfer, or
electroporation. Suitable methods for transforming or transfecting host cells including
plant cells are known to the skilled artisan.

For stable transfection of mammalian cells, only a small fraction of cells may integrate the
foreign DNA into their genome, depending upon the expression vector and transfection
technique used. In order to identify and select these integrants, a gene that encodes a
selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells
along with the gene of interest. Preferred selectable markers include those which confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding
the polypeptide of the present invention or can be introduced on a separate vector. Cells
stably transfected with the introduced nucleic acid can be identified by, for example, drug
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this invention, a cDNA pool was used as a template in this PCR cloning of cDNA. The
said cDNA pool consists of various cDNA species which were synthesized in vitro by the
viral reverse transcriptase and Taq polymerase (CapFinder Kit manufactured by Clontech,
Palo Alto, U.S.A.) by using the mRNA obtained from P. rhodozyma as a template. cDNA
of interest thus obtained was confirmed in its sequence. Furthermore, cDNA thus
obtained was used for a confirmation of its enzyme activity after the cloning of the cDNA
fragment into an expression vector which functions in E. coli under the strong promoter
activity such as the lac or T7 expression system.

In another embodiment, the present invention relates to a method for making a recombi- 
nant vector comprising inserting a polynucleotide of the invention into a vector. 

Further, the present invention relates to a recombinant vector containing the polynucleo- 
tide of the invention or produced by said method of the invention. 

As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting a
polynucleotide to which it has been linked. One type of vector is a "plasmid", which refers
to a circular double stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein additional DNA or PNA segments
can be ligated into the viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial
origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal
mammalian vectors) are integrated into the genome of a host cell upon introduction into
the host cell, and thereby are replicated along with the host genome. Moreover, certain
vectors are capable of directing the expression of genes to which they are operatively
linked. Such vectors are referred to herein as "expression vectors". In general, expression
vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the
present specification, "plasmid" and "vector" can be used interchangeably as the plasmid is
the most commonly used form of vector. However, the invention is intended to include
such other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent
functions.

The present invention also relates to cosmids, viruses, bacteriophages and other vectors
used conventionally in genetic engineering that contain a nucleic acid molecule according
to the invention. Methods which are well known to those skilled in the art can be used to
construct various plasmids and vectors. Alternatively, the nucleic acid molecules and
vectors of the invention can be reconstituted into liposomes for delivery to target cells.
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Expression of proteins in prokaryotes is most often carried out With vectors containing 
constitutive or inducible promoters directing the expression of either fusion or non-fusion
proteins. Fusion vectors add a number of amino acids to a protein encoded therein,
usually to the amino terminus of the recombinant protein but also to the C-terminus or fused
within suitable regions in the proteins. Such fusion vectors typically serve three purposes:
1) to increase expression of recombinant protein; 2) to increase the solubility of the
recombinant protein; and 3) to aid in the purification of the recombinant protein by
acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic
cleavage site is introduced at the junction of the fusion moiety and the recombinant
protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate
recognition sequences, include Factor Xa, thrombin and enterokinase.

Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc.), pMAL (New
England Biolabs, Beverly, USA) and pRIT5 (Pharmacia, Piscataway, USA) which fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the
target recombinant protein. In one embodiment, the coding sequence of the polypeptide
encoded by the polynucleotide of the present invention is cloned into a pGEX expression
vector to create a vector encoding a fusion protein comprising, from the N-terminus to the
C-terminus, GST-thrombin cleavage site-X protein. The fusion protein can be purified by
affinity chromatography using glutathione-agarose resin, e.g. recombinant acetyl-CoA
carboxylase unfused to GST can be recovered by cleavage of the fusion protein with
thrombin.
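
The arrangement "GST-thrombin cleavage site-X protein" and the release of the unfused
protein can be modelled with a brief sketch. It is assumed, for illustration, that the
cleavage site is the hexapeptide Leu-Val-Pro-Arg-Gly-Ser with cleavage after the arginine;
the tag and target sequences are short hypothetical stand-ins, not the GST sequence and not
SEQ ID NO:3.

```python
# Assumed thrombin recognition sequence; cleavage is modelled after the arginine (R).
THROMBIN_SITE = "LVPRGS"
CUT_OFFSET = 4  # number of site residues that stay with the N-terminal (tag) part


def build_fusion(tag: str, target: str) -> str:
    """N- to C-terminus: affinity tag, thrombin site, target protein."""
    return tag + THROMBIN_SITE + target


def thrombin_cleave(fusion: str) -> list[str]:
    """Split the fusion at the first thrombin site; return it unchanged if none is found."""
    i = fusion.find(THROMBIN_SITE)
    if i < 0:
        return [fusion]
    cut = i + CUT_OFFSET
    return [fusion[:cut], fusion[cut:]]


tag = "MSPILGYWK"   # hypothetical stand-in for the GST moiety
target = "MASEQ"    # hypothetical stand-in for the acetyl-CoA carboxylase moiety

fusion = build_fusion(tag, target)
tag_part, released = thrombin_cleave(fusion)
print(tag_part, released)
```

Under this model the released protein retains the two residues of the site that lie
C-terminal to the cut, which is why the recovered product begins with Gly-Ser in the sketch.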

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc and pET
11d. Target gene expression from the pTrc vector relies on host RNA polymerase
transcription from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d
vector relies on transcription from a T7 gn10-lac fusion promoter, mediated by a
coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains
BL21(DE3) or HMS174(DE3) from a resident λ prophage harboring a T7 gn1 gene under
the transcriptional control of the lacUV 5 promoter.

One strategy to maximize recombinant protein expression is to express the protein in host
bacteria with an impaired capacity to proteolytically cleave the recombinant protein.
Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an
expression vector so that the individual codons for each amino acid are those preferentially

reaction mixture contained 50 mM Tris, pH 7.5, 6 µM acetyl-CoA, 2 mM ATP, 7 mM
KHCO3, 8 mM MgCl2, 1 mM dithiothreitol, and 1 mg/ml bovine serum albumin. Enzyme
is preincubated (30 min, 25°C) with bovine serum albumin (2 mg/ml) and potassium
citrate (10 mM). Reactions are initiated by transferring 50 µl of preincubated enzyme to
the reaction mixture (final volume 200 µl) and incubated for 5-20 min at 25°C. Reactions
are terminated by addition of 50 µl 10% perchloric acid. Following termination of the
reaction, the samples are centrifuged (3 min, 10,000 x g) and analyzed by HPLC. A mobile
phase of 10 mM KH2PO4, pH 6.7 (solvent A), and MeOH (solvent B) is used. The flow
rate is 1.0 ml/min, and the gradient is as follows: hold at 100% solvent A for 1 min
followed by a linear gradient to 30% solvent B over the next 5 min, then hold at 30%
solvent B for 5 min. Using this method, retention times were 7.5 and 9.0 min for
malonyl-CoA and acetyl-CoA, respectively. When an expression vector for S. cerevisiae is
used, a complementation analysis can be conveniently exploited by using a conditional
acetyl-CoA carboxylase null mutant strain derived from S. cerevisiae as a host strain for its
confirmation of activity.
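
The calculation of a velocity by linear regression of the malonyl-CoA concentration with
respect to time, as used in the assay, can be sketched as an ordinary least-squares fit. The
concentration values below are hypothetical readings invented for the illustration (sampled
at the 5 min intervals of the assay), and the function name is likewise an assumption.

```python
def linear_fit(times_min: list[float], conc_um: list[float]) -> tuple[float, float]:
    """Least-squares line conc = slope * time + intercept; returns (slope, intercept)."""
    if len(times_min) != len(conc_um) or len(times_min) < 2:
        raise ValueError("need at least two paired observations")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_c = sum(conc_um) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, conc_um))
    slope = sxy / sxx
    return slope, mean_c - slope * mean_t


# Hypothetical malonyl-CoA concentrations (µM) at 0, 5, 10, 15 and 20 min.
times = [0.0, 5.0, 10.0, 15.0, 20.0]
malonyl_coa = [0.0, 0.5, 1.0, 1.5, 2.0]

velocity, intercept = linear_fit(times, malonyl_coa)  # velocity in µM per min
print(velocity, intercept)
```

The slope of the fitted line is taken as the velocity; the fit is only meaningful over the
interval in which the conversion is linear, stated above to be 20 min.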


Following the confirmation of the enzyme activity, an expressed protein would be
purified and used for raising an antibody against the purified enzyme. The antibody thus
prepared would be used for a characterization of the expression of the corresponding
enzyme in a strain improvement study, an optimization study of the culture condition,
and the like.

In a further embodiment, the present invention relates to an antibody that binds
specifically to the polypeptide of the present invention or parts, i.e. specific fragments or
epitopes of such a protein.

The antibodies of the invention can be used to identify and isolate other acetyl-CoA
carboxylases and genes. These antibodies can be monoclonal antibodies, polyclonal antibodies
or synthetic antibodies as well as fragments of antibodies, such as Fab, Fv or scFv
fragments etc. Monoclonal antibodies can be prepared, for example, by the techniques as
originally described by Kohler and Milstein, which comprise the fusion of mouse myeloma
cells to spleen cells derived from immunized mammals.

Furthermore, antibodies or fragments thereof to the aforementioned peptides can be
obtained by using methods known to the skilled person. These antibodies can be used, for
example, for the immunoprecipitation and immunolocalization of proteins according to
the invention as well as for the monitoring of the synthesis of such proteins, for example,
in recombinant organisms, and for the identification of compounds interacting with the
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protein according to the invention. For example, surface plasmon resonance as employed
in the BIAcore system can be used to increase the efficiency of phage antibody selections,
yielding a high increment of affinity from a single library of phage antibodies which bind
to an epitope of the protein of the invention. In many cases, the binding phenomenon of
antibodies to antigens is equivalent to other ligand/anti-ligand binding.

In this invention, the gene fragment for acetyl-CoA carboxylase was cloned from P.
rhodozyma with a purpose to decrease its expression level in P. rhodozyma by a genetic method
using the cloned gene fragment.

To decrease gene expression with genetic methods, some strategies can be employed, one
of which is a gene-disruption method. In this method, a partial fragment of the objective
gene to be disrupted is ligated to a drug resistance cassette on an integration vector which
cannot replicate in the host organism. A drug resistance gene which encodes an enzyme
that enables the host to survive in the presence of a toxic antibiotic is often used as the
selectable marker. The G418 resistance gene harbored in pGB-Ph9 (Wery et al. (Gene, 184,
89-97, 1997)) is an example of a drug resistance gene which functions in P. rhodozyma.

A nutrition complementation marker can also be used in a host which has an appropriate
auxotrophy marker. The P. rhodozyma ATCC24221 strain, which requires cytidine for its growth,
is one example of such an auxotroph. By using CTP synthetase as donor DNA for ATCC24221,
a host vector system using nutrition complementation can be established.

After the transformation of the host organism and recombination between the objective
gene fragment on the vector and its corresponding gene fragment on the chromosome of
the host organism, the integration vector is integrated onto the host chromosome by
single cross recombination. As a result of this recombination, the drug resistance cassette
would be inserted in the objective gene, whose translated product is only synthesized in its
truncated form which does not have its enzymatic function. In a similar manner, two
parts of the objective gene were also used for a gene disruption study in which the drug
resistance gene can be inserted between such two partial fragments of the objective gene on
the integration vector. In the case of this type of vector, a double recombination event
between the gene fragments harbored on the integration vector and the corresponding
gene fragments on the chromosome of the host is expected. Although the frequency of this
double crossing-over recombination is lower than that of single cross recombination, the null
phenotype of the objective gene by the double cross recombination is more stable than by
the single cross recombination.
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On the other hand, this strategy has difficulty in the case of the gene whose function is 
essential and whose disruption is lethal for the host organism, such as the acetyl-CoA
carboxylase gene. The function of acetyl-CoA carboxylase is indispensable for the survival of
the host, not only for the biosynthesis of fatty acids. From such a viewpoint, it seemed to be
difficult to construct an acetyl-CoA carboxylase disruptant from P. rhodozyma by this gene
disruption method.


In such a case, other strategies can be applied to decrease (not to disrupt) a gene expression,
one of which is conventional mutagenesis to screen for a mutant whose expression
of acetyl-CoA carboxylase is decreased. In this method, an appropriate recombinant in
which an appropriate reporter gene is fused to the promoter region of the acetyl-CoA
carboxylase gene from the host organism is mutated, and mutants which show a weaker activity
of the reporter gene product can be screened. In such mutants, it is expected that the
expression of acetyl-CoA carboxylase activity is decreased by a mutation lying in the promoter
region of the reporter gene or in a trans-acting region which might affect the expression of the
acetyl-CoA carboxylase gene, other than a mutation lying in the promoter gene itself. In the
case of a mutation occurring at the promoter region of the reporter fusion, such a mutation
can be isolated by sequencing the corresponding region. The mutation thus isolated can
be introduced into a variety of carotenoid-producing, especially astaxanthin-producing, mutants
derived from P. rhodozyma by a recombination between the original promoter for the
acetyl-CoA carboxylase gene on the chromosome and the mutated promoter fragment. To
exclude mutations occurring at a trans-acting region, a mutation can also be induced by
in vitro mutagenesis of a cis element in the promoter region. In this approach, a gene
cassette, containing a reporter gene which is fused to a promoter region derived from a
gene of interest at its 5'-end and a terminator region from a gene of interest at its 3'-end, is
mutagenized and then introduced into P. rhodozyma. By detecting the difference in the
activity of the reporter gene, an effective mutation can be screened. Such a mutation can
be introduced into the sequence of the native promoter region on the chromosome by the
same method as in the case of the in vivo mutation approach. However, these methods have
the drawback of involving time-consuming processes.


Another strategy to decrease a gene expression is the antisense method. This method is
frequently applied to decrease the gene expression even when teleomorphic organisms such
as P. rhodozyma are used as host organisms, to which the mutation and gene disruption
methods are usually difficult to apply. The antisense method is a method to decrease
the expression of a gene of interest by introducing an artificial gene fragment whose
sequence is complementary to a cDNA fragment of the gene of interest. Such an anti-sense
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: • ; 
extrachromosomally. In this respect, it is also to be understood that the nucleic acid
molecule of the invention can be used to restore or create a mutant gene via homologous
recombination.


Accordingly, in another embodiment the present invention relates to a host cell genetically
engineered with the polynucleotide of the invention or the vector of the invention.


The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is
understood that such terms refer not only to the particular subject cell but to the progeny
or potential progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not, in
fact, be identical to the parent cell, but are still included within the scope of the term as
used herein.


For example, a polynucleotide of the present invention can be introduced in bacterial cells
as well as insect cells, fungal cells or mammalian cells (such as Chinese hamster ovary cells
(CHO) or COS cells), algae, ciliates, plant cells, fungi or other microorganisms like E. coli.
Other suitable host cells are known to those skilled in the art. Preferred are E. coli,
baculovirus, Agrobacterium or fungal cells, for example, those of the genus Saccharomyces,
e.g. those of the species S. cerevisiae, or P. rhodozyma (Xanthophyllomyces dendrorhous).


In addition, in one embodiment, the present invention relates to a method for the production
of fungal transformants comprising the introduction of the polynucleotide or the vector
of the present invention into the genome of said fungal cell.

For the expression of the nucleic acid molecules according to the invention in sense or
antisense orientation in plant cells, the molecules are placed under the control of regulatory
elements which ensure the expression in fungal cells. These regulatory elements may
be heterologous or homologous with respect to the nucleic acid molecule to be expressed
as well as with respect to the fungal species to be transformed.

In general, such regulatory elements comprise a promoter active in fungal cells. To obtain
constitutive expression in fungal cells, preferably constitutive promoters are used, e.g. the
glyceraldehyde-3-dehydrogenase promoter derived from P. rhodozyma (WO 97/23,633).
Inducible promoters may be used in order to be able to exactly control expression. An
example of inducible promoters is the promoter of genes encoding heat shock proteins.
Also, an amylase gene promoter which is a candidate for such inducible promoters has
been described (EP 1,035,206). The regulatory elements may further comprise transcriptional
and/or translational enhancers functional in fungal cells. Furthermore, the regula-
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selection (e.g., cells that have incorporated the selectable marker gene will survive, while
the other cells die).


To create a homologous recombinant microorganism, a vector is prepared which contains
at least a portion of the polynucleotide of the present invention into which a deletion,
addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the
acetyl-CoA carboxylase gene. Preferably, this acetyl-CoA carboxylase gene is a P. rhodozyma
acetyl-CoA carboxylase gene, but it can be a homologue from a related or different
source. Alternatively, the vector can be designed such that, upon homologous recombination,
the endogenous acetyl-CoA carboxylase gene is mutated or otherwise altered but still
encodes a functional protein (e.g., the upstream regulatory region can be altered to thereby
alter the expression of the endogenous acetyl-CoA carboxylase). To create a point mutation
via homologous recombination, DNA-RNA hybrids can also be used, in a technique known as
chimeraplasty (Cole-Strauss et al., Nucl. Acids Res., 27, 5, 1323-1330, 1999 and
Kmiec, Gene therapy, American Scientist, 87, 3, 240-247, 1999).

The vector is introduced into a cell, and cells in which the introduced polynucleotide gene
has homologously recombined with the endogenous acetyl-CoA carboxylase gene are
selected, using art-known techniques.


Further host cells can be produced which contain selection systems which allow for regulated
expression of the introduced gene. For example, inclusion of the polynucleotide of the
invention on a vector placing it under control of the lac operon permits expression of the
polynucleotide only in the presence of IPTG. Such regulatory systems are well known in
the art.


Preferably, the introduced nucleic acid molecule is foreign to the host cell.

By "foreign" it is meant that the nucleic acid molecule is either heterologous with respect
to the host cell, which means derived from a cell or organism with a different genomic
background, or is homologous with respect to the host cell but located in a different genomic
environment than the naturally occurring counterpart of said nucleic acid molecule. This
means that, if the nucleic acid molecule is homologous with respect to the host cell, it is
not located in its natural location in the genome of said host cell; in particular, it is
surrounded by different genes. In this case the nucleic acid molecule may be either under the
control of its own promoter or under the control of a heterologous promoter. The vector
or nucleic acid molecule according to the invention which is present in the host cell may
either be integrated into the genome of the host cell or it may be maintained in some form
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tory elements may include transcription termination signals, such as a poly-A signal,
which lead to the addition of a poly-A tail to the transcript which may improve its stability.

Methods for the introduction of foreign DNA into fungal cells are also well known in the
art. These include, for example, transformation with the LiCl method, the fusion of
protoplasts, electroporation, biolistic methods like particle bombardment and other methods
known in the art. Methods for the transformation using biolistic methods are well known to the
person skilled in the art.






The term "transformation" as used herein refers to the transfer of an exogenous
polynucleotide into a host cell, irrespective of the method used for the transfer. The
polynucleotide may be transiently or stably introduced into the host cell and may be
maintained non-integrated, for example, as a plasmid or as chimeric links, or alternatively, may
be integrated into the host genome.

In general, the fungi which can be modified according to the invention and which either
show overexpression of a protein according to the invention or a reduction of the synthesis
of such a protein can be derived from any desired fungal species.


Further, in one embodiment, the present invention relates to a fungal cell comprising the
polynucleotide or the vector, or obtainable by the method of the present invention.

Thus, the present invention relates also to transgenic fungal cells which contain (preferably
stably integrated into the genome) a polynucleotide according to the invention linked to
regulatory elements which allow expression of the polynucleotide in fungal cells and
wherein the polynucleotide is foreign to the transformed fungal cell. For the meaning of
foreign, see supra.

Thus, the present invention also relates to transformed fungal cells according to the
invention.


Accordingly, due to the altered expression of acetyl-CoA carboxylase, the metabolic pathways
of the cells are modulated in yield, production, and/or efficiency of production.

The terms "production" or "productivity" are art-recognized and include the concentration
of the fermentation product (for example fatty acids, carotenoids, (poly)saccharides,
lipids, vitamins, isoprenoids, wax esters, and/or polymers like polyhydroxyalkanoates
and/or its metabolism products or further desired fine chemicals as mentioned herein)
formed within a given time and a given fermentation volume (e.g., kg product per hour
per liter).
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The term "efficiency" of production includes the time required for a particular level of pro- 
duction to be achieved (for example, how long it takes for the cell to attain a particular rate 
of output of a said altered yield, in particular, into carotenoids, (poly)saccharides, lipids,
vitamins, isoprenoids etc.). 


The term "yield" or "product/carbon yield" is art-recognized and includes the efficiency of
the conversion of the carbon source into the product (i.e. acetyl-CoA, fatty acids, vitamins,
carotenoids, isoprenoids, lipids etc. and/or further compounds as defined above and
whose biosynthesis is based on said products). This is generally written as, for example, kg
product per kg carbon source. By increasing the yield or production of the compound, the
quantity of recovered molecules, or of useful recovered molecules of that compound in a
given amount of culture over a given amount of time is increased.
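
The two quantities defined above can be made concrete with a short calculation. The following Python sketch is only an illustration of the stated units (kg product per hour per liter, and kg product per kg carbon source); the function names and all numbers are assumptions chosen for the example and do not come from the Examples.

```python
def volumetric_productivity(product_kg, hours, volume_l):
    """Product formed per unit time and fermentation volume (kg per hour per liter)."""
    if hours <= 0 or volume_l <= 0:
        raise ValueError("hours and volume_l must be positive")
    return product_kg / (hours * volume_l)


def product_carbon_yield(product_kg, carbon_source_kg):
    """Efficiency of carbon source conversion (kg product per kg carbon source)."""
    if carbon_source_kg <= 0:
        raise ValueError("carbon_source_kg must be positive")
    return product_kg / carbon_source_kg


# Hypothetical fermentation run (illustrative numbers only).
product_kg = 0.5        # product recovered
hours = 50.0            # fermentation time
volume_l = 10.0         # fermentation volume
carbon_source_kg = 20.0 # carbon source consumed

print(volumetric_productivity(product_kg, hours, volume_l))
print(product_carbon_yield(product_kg, carbon_source_kg))
```

Under these definitions a strain can improve in productivity (faster formation) without any change in yield (same conversion of the carbon source), which is why the text treats the two terms separately.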

The terms "biosynthesis" (which is used synonymously for "synthesis" or "biological
production" in cells, tissues, plants, etc.) or a "biosynthetic pathway" are art-recognized and
include the synthesis of a compound, preferably an organic compound, by a cell from
intermediate compounds in what may be a multistep and highly regulated process.

The language "metabolism" is art-recognized and includes the totality of the biochemical
reactions that take place in an organism. The metabolism of a particular compound, then,
(e.g., the metabolism of acetyl-CoA, a fatty acid, hexose, isoprenoid, vitamin, carotenoid,
lipid etc.) comprises the overall biosynthetic, modification, and degradation pathways in
the cell related to this compound.

Such a genetically engineered P. rhodozyma would be cultivated in an appropriate medium
and evaluated for its productivity of carotenoids, especially astaxanthin. A hyper producer
of astaxanthin thus selected would be confirmed in view of the relationship between its
productivity and the level of gene or protein expression which is introduced by such a
genetic engineering method.

The present invention is further illustrated with Examples described below. 

The following materials and methods were employed in the Examples described below:

Strains 

P. rhodozyma ATCC96594 (re-deposited under the accession No. ATCC 74438 on April 8,
1998 pursuant to the Budapest Treaty)

E. coli DH5α: F-, φ80d, lacZΔM15, Δ(lacZYA-argF)U169, hsd (rK-, mK+), recA1, endA1,
deoR, thi-1, supE44, gyrA96, relA1 (Toyobo, Osaka, Japan)
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Isolation of total RNA from P. rhodozyma was performed with the phenol method by using 
Isogen (Nippon Gene, Toyama, Japan). mRNA was purified from the total RNA thus obtained
by using the mRNA separation kit (Clontech). cDNA was synthesized by using the
CapFinder cDNA construction kit (Clontech).
In vitro packaging was performed by using Gigapack III gold packaging extract (Stratagene).
The polymerase chain reaction (PCR) was performed with the thermal cycler from Perkin
Elmer model 2400. Each PCR condition is described in the Examples. PCR primers were
purchased from a commercial supplier. Fluorescent DNA primers for DNA sequencing were
purchased from Pharmacia. DNA sequencing was performed with the automated fluorescent
DNA sequencer (ALFred, Pharmacia).
Competent cells of DH5α were purchased from Toyobo (Japan).

Example 1: Isolation of mRNA from P. rhodozyma and construction of cDNA library 


To construct a cDNA library of P. rhodozyma, total RNA was isolated by the phenol extraction
method right after the cell disruption, and the mRNA from P. rhodozyma ATCC96594
strain was purified by using the mRNA separation kit (Clontech).

At first, cells of ATCC96594 strain from 10 ml of two-day culture in YPD medium were
harvested by centrifugation (1500 x g for 10 min.) and washed once with extraction buffer
(10 mM Na-citrate / HCl (pH 6.2) containing 0.7 M KCl). After suspending in 2.5 ml of
extraction buffer, the cells were disrupted by a French press homogenizer (Ohtake Works
Corp., Tokyo, Japan) at 1500 kgf/cm2 and immediately mixed with two volumes of
Isogen (Nippon Gene) according to the method specified by the manufacturer. In this step,
400 μg of total RNA was recovered.

Then, this total RNA was purified by using the mRNA separation kit (Clontech) according to
the method specified by the manufacturer. Finally, 16 μg of mRNA from P. rhodozyma
ATCC96594 strain was obtained.
To construct the cDNA library, the CapFinder PCR cDNA construction kit (Clontech) was used
according to the method specified by the manufacturer. One μg of purified mRNA was
applied for a first strand synthesis followed by PCR amplification. After this amplification
by PCR, 1 mg of cDNA pool was obtained.

Example 2: Cloning of a partial ACC (acetyl-CoA carboxylase) gene from P. rhodozyma
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E. coli XL1-Blue MRF': Δ(mcrA)183, Δ(mcrCB-hsdSMR-mrr)173, endA1, supE44, thi-1,
recA1, gyrA96, relA1, lac [F' proAB, lacIqZΔM15, Tn10 (tetr)] (Stratagene, La Jolla, USA)
E. coli SOLR: e14-(mcrA), Δ(mcrCB-hsdSMR-mrr)171, sbcC, recB, recJ, umuC::Tn5(kanr),
uvrC, lac, gyrA96, relA1, thi-1, endA1, λR, [F' proAB, lacIqZΔM15] Su-(nonsuppressing)
(Stratagene)

E. coli TOP10: F-, mcrA, Δ(mrr-hsdRMS-mcrBC), ΔlacZ M15, ΔlacX74, recA1, deoR,
araD139, (ara-leu)7697, galU, galK, rpsL (Strr), endA1, nupG (Invitrogen, Carlsbad, USA)

Vectors

λZAPII (Stratagene)

pBluescriptII KS- (Stratagene)

pMOSBlue T-vector (Amersham, Buckinghamshire, U.K.)
pCR2.1-TOPO (Invitrogen)

Media

P. rhodozyma strain was maintained routinely in YPD medium (DIFCO, Detroit, U.S.A.).
E. coli strain was maintained in LB medium (10 g Bacto-tryptone, 5 g yeast extract (DIFCO)
and 5 g NaCl per liter). NZY medium (5 g NaCl, 2 g MgSO4-7H2O, 5 g yeast extract
(DIFCO), 10 g NZ amine type A (WAKO, Osaka, Japan) per liter) is used for λ phage
propagation in a soft agar (0.7 % agar (WAKO)). When an agar medium was prepared, 1.5 %
of agar (WAKO) was supplemented.

Methods

Restriction enzymes and T4 DNA ligase were purchased from Takara Shuzo (Ohtsu,
Japan).

Isolation of chromosomal DNA from P. rhodozyma was performed by using the QIAGEN
Genomic Kit (QIAGEN, Hilden, Germany) following the protocol supplied by the
manufacturer. Mini-prep of plasmid DNA from transformed E. coli was performed with the
Automatic DNA isolation system (PI-50, Kurabo Co. Ltd., Osaka, Japan). Midi-prep of
plasmid DNA from an E. coli transformant was performed by using a QIAGEN column
(QIAGEN). Isolation of λ DNA was performed by the Wizard lambda preps DNA
purification system (Promega, Madison, U.S.A.) following the protocol prepared by the
manufacturer. A DNA fragment was isolated and purified from agarose by using
QIAquick or QIAEX II (QIAGEN). Manipulation of λ phage derivatives followed
the protocol prepared by the manufacturer (Stratagene).



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To clone a partial ACC gene from P. rhodozyma, a degenerate PCR method was exploited. 

Species and database accession numbers of the acetyl-CoA carboxylase sequences which
were used for multiple alignment analysis are as follows.

Arabidopsis thaliana D34630 (DDBJ)
Emericella nidulans Y15996 (EMBL)
Gallus gallus P11029 (Swiss-Prot)
Glycine max L48995 (GenBank)
Homo sapiens S41121 (PIR)
Medicago sativa L25042 (GenBank)
Ovis aries Q28559 (Swiss-Prot)
Rattus norvegicus P11497 (Swiss-Prot)
Saccharomyces cerevisiae Q00955 (Swiss-Prot)
Schizosaccharomyces pombe P78820 (Swiss-Prot)
Ustilago maydis S49991 (PIR)

Two mixed primers were designed and synthesized based on the common sequence of known
acetyl-CoA carboxylase genes from other species: acc9 (sense primer) (SEQ ID NO:4) and
acc13 (antisense primer) (SEQ ID NO:5) (in the sequences, "n" means nucleotides a, c, g
or t; "h" means nucleotides a, c or t; "m" means nucleotides a or c; "k" means nucleotides
g or t; and "y" means nucleotides c or t).
After the PCR reaction of 25 cycles of 95°C for 30 seconds, 45°C for 30 seconds and 72°C
for 15 seconds by using ExTaq (Takara Shuzo) as a DNA polymerase and the cDNA pool
obtained in Example 1 as a template, the reaction mixture was applied to agarose gel
electrophoresis. One PCR band that had the desired length (0.8 kb) was recovered from the
agarose gel and purified by QIAquick (QIAGEN) according to the method by the manufacturer
and then ligated to pMOSBlue T-vector (Amersham). After transformation of competent
E. coli DH5α, 6 white colonies were selected and plasmids were isolated with the Automatic
DNA isolation system. As a result of sequencing, it was found that 3 clones had a sequence
whose deduced amino acid sequence was similar to known acetyl-CoA carboxylase genes.
These isolated cDNA clones were designated as pACC1014 and used for further screening
study.
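
The mixed-base notation used for the degenerate primers above (n, h, m, k, y) can be illustrated with a short Python sketch that counts and lists the concrete oligonucleotides a mixed primer stands for. The example sequence is hypothetical and is NOT one of the primers of SEQ ID NO:4 or SEQ ID NO:5; the function names are likewise assumptions made for this illustration.

```python
from itertools import product

# Mixed-base codes as defined in the text (n, h, m, k, y) plus the plain bases.
CODES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "n": "acgt", "h": "act", "m": "ac", "k": "gt", "y": "ct",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a mixed primer."""
    count = 1
    for base in primer.lower():
        count *= len(CODES[base])
    return count


def expand_primer(primer):
    """List every concrete sequence encoded by a mixed primer."""
    pools = [CODES[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical 8-mer used only to demonstrate the notation.
example = "gaygcnkm"
print(degeneracy(example))
print(expand_primer("ky"))
```

The degeneracy grows multiplicatively with each mixed position, which is one reason a low annealing temperature such as the 45°C used above is commonly chosen for degenerate PCR.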

Example 3: Isolation of genomic DNA from P. rhodozyma

To isolate a genomic DNA from P. rhodozyma, QIAGEN genomic kit was used according 
to the method specified by the manufacturer. 
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At first, cells of P. rhodozyma ATCC96594 strain from 100 ml of overnight culture in YPD
medium were harvested by centrifugation (1500 x g for 10 min.) and washed once with TE
buffer (10 mM Tris / HCl (pH 8.0) containing 1 mM EDTA). After suspending in 8 ml of
Y1 buffer of the QIAGEN genomic kit, lyticase (SIGMA, St. Louis, U.S.A.) was added at
a concentration of 2 mg/ml to disrupt the cells by enzymatic degradation, and the reaction
mixture was incubated for 90 min at 30°C and then proceeded to the next extraction step.
Finally, 20 μg of genomic DNA was obtained.

Example 4: Southern blot hybridization by using pACC1014 as a probe 

Southern blot hybridization was performed to clone a genomic fragment which contains
the ACC gene from P. rhodozyma. Two μg of genomic DNA was digested by EcoRI and
subjected to agarose gel electrophoresis followed by acidic and alkaline treatment. The
denatured DNA was transferred to a nylon membrane (Hybond N+, Amersham) by using
transblot (Joto Rika, Tokyo, Japan) for an hour. The DNA which was transferred to the nylon
membrane was fixed by a heat treatment (80°C, 90 min). A probe was prepared by labeling
a template DNA (EcoRI- and SalI-digested pACC1014) with the DIG multipriming method
(Boehringer Mannheim). Hybridization was performed with the method specified by the
manufacturer. As a result, a hybridized band was visualized in the range from 2.0 to 2.3
kilobases (kb).
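
The restriction digestion and size-range reasoning used here can be mimicked in silico. The Python sketch below is a minimal illustration that assumes a linear DNA molecule and the EcoRI recognition site GAATTC cut after the first G; the toy sequence and the function names are hypothetical and unrelated to the actual P. rhodozyma genome.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the position of the cut within the site
    (1 for EcoRI, which cuts between G and AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies within a size window (inclusive)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence with two EcoRI sites (illustrative only).
toy = "AAGAATTCTTTTGAATTCCC"
fragments = digest(toy)
print([len(f) for f in fragments])
print(size_select(fragments, 5, 8))
```

Selecting a size window around the hybridizing band is the same idea as recovering DNA of a given length range from the agarose gel before library construction.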


Example 5: Cloning of a genomic fragment containing the ACC gene 

4 μg of the genomic DNA was digested by EcoRI and subjected to agarose gel electrophoresis.
Then, DNAs with a length within the range from 1.5 to 2.7 kb were recovered by
QIAEX II gel extraction kit (QIAGEN) according to the method specified by the manufacturer.
The purified DNA was ligated to 0.5 μg of EcoRI-digested and CIAP (calf intestine
alkaline phosphatase)-treated λZAP II (Stratagene) at 16°C overnight, and packaged by
Gigapack III gold packaging extract (Stratagene). The packaged extract was infected to E.
coli MRF' strain and over-laid with NZY medium poured onto LB agar medium. About
5000 plaques were screened by using EcoRI- and SalI-digested pACC1014 as a probe. Five
plaques were hybridized to the labeled probe.
The in vivo excision protocol was applied to these λZAP II derivatives containing the putative
ACC gene from P. rhodozyma by following the instruction manual (Stratagene) to clone
the insert fragment into the E. coli cloning vector, pBluescript SK. Each clone recovered from
the five positive plaques was subjected to sequencing analysis and it was found that three
of them had the identical sequence to the insert fragment of pACC1014. One of the clones
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revealed that this clone contained a 5' fragment of the ACC gene as a result of BLAST X analysis.
This clone was named pACCPvu126 and used for further study.


Example 10: Southern blot hybridization by using pACCPvu126 as a probe

Southern blot hybridization was performed to clone a genomic fragment which covered the 5'
end of the ACC gene from P. rhodozyma. In a similar manner to Example 7, Southern blot
hybridization was performed. A probe was prepared by labeling a template DNA
(EcoRI-digested pACCPvu126) with the DIG multipriming method (Boehringer Mannheim).
Hybridization was performed with the method specified by the manufacturer. As a result, a
hybridized band whose size was close to 5.0 kb was visualized.

Example 11: Cloning of the genomic clone covering the 5' end of the ACC gene

In a similar manner to Example 8, the genomic fragment containing the insert fragment in
pACCPvu126 was cloned by plaque hybridization. The genomic library covering 2.7 to 6.0
kb in length prepared in Example 8 was also used. Twelve positive plaques which hybridized
to the insert fragment of pACCPvu126 labeled with DIG were isolated and subjected
to in vivo excision to obtain plasmid DNA. As a result of sequencing of the plasmids thus
isolated, most of the plasmids had the identical sequence to the insert fragment of
pACCPvu126. One of the clones was named pACC204 and used for further study.

Example 12: Cloning of the gapped region between pACC204 and pACC127-17-0.9 

As a result of BLAST X analysis against known acetyl-CoA carboxylase genes following
the sequencing study of the 3' end of the insert fragment in pACC204 and the 5' end of the
insert fragment in pACC127-17-0.9, it was suggested that an approximately 0.3 kb fragment
could still be missing. The following PCR primers
were synthesized based on the internal sequence of pACC204 and pACC127-17-0.9: acc43
(sense primer) (SEQ ID NO:9) and acc44 (antisense primer) (SEQ ID NO:10).
After the PCR reaction of 25 cycles of 94°C for 15 seconds, 55°C for 30 seconds and 72°C
for 15 seconds by using HF polymerase (Clontech) as a DNA polymerase and the genomic
DNA obtained in Example 3 as a template, the reaction mixture was applied to agarose gel
electrophoresis. One PCR band that had the desired length (0.3 kb) was recovered from the
agarose gel and purified by QIAquick (QIAGEN) according to the method by the
manufacturer and then cloned into pCR2.1-TOPO (Invitrogen). After transformation of
competent E. coli TOP10, 6 white colonies were selected and plasmids were isolated with the
Automatic DNA isolation system. As a result of sequencing, it was found that 5 clones had an
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DNA was digested by EcoRI and subjected to agarose gel electrophoresis. Then, DNAs
with a length within the following ranges were recovered by QIAEX II gel extraction kit
(QIAGEN) according to the method specified by the manufacturer: (1) from 2.7 to 5.0 kb;
(2) from 1.4 to 2.7 kb; and (3) from 0.5 to 1.4 kb.

Each purified DNA was ligated to 0.5 μg of EcoRI-digested and CIAP (calf intestine
alkaline phosphatase)-treated λZAP II (Stratagene) at 16°C overnight, and packaged by
Gigapack III gold packaging extract (Stratagene). The packaged extract was infected to E. coli
MRF' strain and over-laid with NZY medium poured onto LB agar medium. About 5000
plaques were screened by using EcoRI-digested pACCStu107 and pACCPvd107 as probes.
The following candidates were isolated after the plaque hybridization study.

1) 3 plaques from the 2.7 to 6.0 kb library by using the insert of pACCPvd107 as a probe.
2) 3 plaques from the 1.4 to 2.7 kb library by using the insert of pACCStu107 as a probe.
3) 21 plaques from the 0.5 to 1.4 kb library by using the insert of pACCStu107 as a probe.

The in vivo excision protocol was applied to these λZAP II derivatives containing the putative
ACC gene from P. rhodozyma by following the instruction manual (Stratagene) to clone
the insert fragment into the E. coli cloning vector, pBluescript SK. Each clone recovered from
the positive plaques was subjected to sequencing analysis. BLAST X analysis
(http://www.blast.genome.ad.jp/) indicated that each clone, at least, contained the putative
ACC gene. The following clones were selected and used for further analysis:
pACC119-18 having a 6 kb insert and covering the 3' end of the ACC gene;
pACC119-17-0.6 having a 0.6 kb insert flanking the 5' end of the pACC1224 insert fragment;
pACC119-17-2 having a 2 kb insert flanking the 5' end of the pACC119-17-0.6 insert fragment;
and
pACC127-17-0.9 having a 0.9 kb insert flanking the 5' end of the pACC119-17-2 insert
fragment.

As a result of sequencing of the entire region of the insert fragments in pACC119-18,
pACC119-17-0.6, pACC119-17-2 and pACC127-17-0.9, it was suggested that these clones
did not cover the 5' end of the ACC gene.

Example 9: Cloning of the flanking region of the insert fragment in pACC127-17-0.9
from the genome of P. rhodozyma by genome walking method

PCR primer acc26 (SEQ ID NO:8) was synthesized based on the internal sequence of
pACC127-17-0.9 and used for the genome walking method.
In the PCR reaction using the acc26 primer, a 2.6 kb PCR band emerged from the genomic
PvuII library. This PCR band was cloned into pCR2.1-TOPO (Invitrogen) and it was
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was named as pACC1224 and used for further study. As a result of whole sequencing of 
the entire region of insert fragment in pACC1224, it was suggested that this clone con- 
tained neither its 5'- nor 3'-end of the ACC gene. 

Example 6: Cloning of the flanking region of the insert fragment in pACC1224 from
the genome of P. rhodozyma by genome walking method

Two PCR primers were synthesized based on the internal sequence of pACC1224 and used
for the genome walking method: acc17 (SEQ ID NO:6) and acc18 (SEQ ID NO:7).
The protocol of the instruction manual provided by the supplier (Clontech) was
followed for the genome walking method. In the PCR reaction using the acc17 primer, a 2.8
kb PCR band emerged from the genomic StuI library. In the case of the acc18 primer, a 2.2 kb
PCR band was produced in the genomic PvuII library. These PCR bands were cloned into
pCR2.1-TOPO (Invitrogen) and it was revealed that the 2.8 kb PCR band contained a 5'
fragment of the ACC gene and the 2.2 kb PCR band contained a 3' fragment of the ACC gene.
The clones containing the 2.8 kb and 2.2 kb PCR fragments were named pACCStu107 and
pACCPvd107, respectively, and used for further study.

Example 7: Southern blot hybridization by using pACCStu107 and pACCPvd107 as
probes

Southern blot hybridization was performed to clone a genomic fragment which covered
the ACC gene from P. rhodozyma. 2 μg of genomic DNA was digested by EcoRI and
subjected to agarose gel electrophoresis followed by acidic and alkaline treatment. The
denatured DNA was transferred to a nylon membrane (Hybond N+, Amersham) by using
transblot (Joto Rika, Tokyo, Japan) for an hour. The DNA which was transferred to the nylon
membrane was fixed by a heat treatment (80°C, 90 min). A probe was prepared by labeling
a template DNA (EcoRI-digested pACCStu107 and pACCPvd107) with the DIG
multipriming method (Boehringer Mannheim). Hybridization was performed with the method
specified by the manufacturer. As a result, several hybridized bands whose sizes were close
to 2.0 kb, 0.9 kb and 0.6 kb were visualized when the insert fragment in pACCStu107 was
used as a probe. In the case that the insert fragment in pACCPvd107 was used as a probe,
a hybridized band was visualized in the range from 6.0 kb to 6.5 kb.

Example 8: Cloning of the genomic clone covering the ACC gene

In a similar manner to Example 5, the genomic fragment containing the insert fragment in
pACCStu107 and pACCPvd107 was cloned by plaque hybridization. 4 μg of the genomic
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Then, the 3.1 kb SacI fragment containing the ribosomal DNA (rDNA) locus (Wery et al.,
Gene, 184, 89-97, 1997) is inserted downstream of the G418 cassette on the plasmid thus
prepared. The rDNA fragment exists in multiple copies on the chromosome of eukaryotes. The
integration event via the rDNA fragment would result in multicopy integration onto the
chromosome of the host used, and this enables the overexpression of foreign genes which
are harbored in the expression vector.

Subsequently, the ACC promoter is inserted upstream of the ACC terminator to construct
an expression vector which functions in P. rhodozyma.

Finally, the antisense ACC construct is completed by inserting the 1.5 kb SfiI fragment
containing antisense ACC into the expression vector thus prepared, which functions in
P. rhodozyma. A similar plasmid construction is disclosed in EP 1,158,051.

Example 15: Transformation of P. rhodozyma with an ACC-antisense vector 

The ACC-antisense vector thus prepared is transformed into P. rhodozyma wild type strain,
ATCC96594. The protocol for the biolistic transformation is disclosed in EP 1,158,051.

Example 16: Characterization of antisense ACC recombinant of P. rhodozyma 

Antisense ACC recombinant of P. rhodozyma ATCC96594 is cultured in 50 ml of YPD
medium in a 500 ml Erlenmeyer flask at 20°C for 3 days, using a seed culture
grown in 10 ml of YPD medium in test tubes (21 mm in diameter) at 20°C for 3 days. An
appropriate volume of culture broth is withdrawn and used for analysis of growth and of
carotenoid productivity, especially that of astaxanthin. For
analysis of growth, optical density at 660 nm is measured with a UV-1200 photometer
(Shimadzu Corp., Kyoto, Japan), and dried cell mass is determined
by drying the cells derived from 1 ml of broth, after microcentrifugation, at 100°C for
one day. For analysis of the content of astaxanthin and total carotenoids, cells are
harvested from 1.0 ml of broth by microcentrifugation and the carotenoids are
extracted from the cells of P. rhodozyma by disruption with glass beads. After extraction,
disrupted cells are removed by centrifugation and the resulting extract is analyzed for
carotenoid content by HPLC. The HPLC conditions are as follows: HPLC column: Chrompack
Lichrosorb si-60 (4.6 mm, 250 mm); Temperature: room temperature; Eluent: acetone /
hexane (18/82), with 1 ml/L of water added to the eluent; Injection volume: 10 µl; Flow rate: 2.0
ml/min; Detection: UV at 450 nm. A reference sample of astaxanthin can be obtained from
Hoffmann La-Roche (Basel, Switzerland).
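
The text does not say how HPLC peak areas are converted into a carotenoid content. One common approach, shown here purely as an assumption, is a single-point external calibration against the astaxanthin reference sample, normalized to dried cell mass. All parameter names and numbers in this Python sketch are hypothetical.

```python
def carotenoid_content_mg_per_g(peak_area, std_peak_area, std_conc_ug_per_ml,
                                extract_volume_ml, dry_cell_mass_mg):
    """Estimate carotenoid content (mg per g dried cells).

    Assumes a linear detector response through the origin and a single
    reference standard of known concentration (an assumption, not a
    procedure stated in the source).
    """
    if min(std_peak_area, extract_volume_ml, dry_cell_mass_mg) <= 0:
        raise ValueError("standard area, volume and dry mass must be positive")
    # Concentration in the extract, by proportion to the standard (µg/ml).
    conc_ug_per_ml = peak_area / std_peak_area * std_conc_ug_per_ml
    total_ug = conc_ug_per_ml * extract_volume_ml
    # µg per mg of dried cells is numerically equal to mg per g.
    return total_ug / dry_cell_mass_mg


# Hypothetical values: sample peak is half the area of a 10 µg/ml standard,
# 1.0 ml of extract, 10 mg of dried cells.
print(carotenoid_content_mg_per_g(1500.0, 3000.0, 10.0, 1.0, 10.0))
```

In practice a multi-point calibration curve is often preferred; the single-point form is used here only to keep the arithmetic visible.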




identical sequence from each other. One of the isolated clones was designated as
pACC210.

Example 13: Sequencing of a complete genomic fragment containing ACC gene 

pACC204, pACC210, pACC127-17-0.9, pACC119-17-2, pACC119-17-0.6, pACC1224 and
pACC119-18 were sequenced by primer walking using the AutoRead sequencing
kit (Pharmacia).

As a result of sequencing, the nucleotide sequence comprising 10561 base pairs of the
genomic fragment containing the ACC gene from P. rhodozyma, including its promoter
(1445 base pairs) and terminator (1030 base pairs), was determined (SEQ ID NO:1).

The coding region was 8086 base pairs long and consisted of 19 exons and 18 introns.
Introns were dispersed throughout the coding region without 5' or 3' bias. A homology
search with GENETYX-SV/RC software (Software Development Co., Ltd., Tokyo, Japan)
showed that the open reading frame (SEQ ID NO:2) encodes a polypeptide of 2187 amino
acids (SEQ ID NO:3) whose sequence is strikingly similar to the known amino acid
sequence of acetyl-CoA carboxylase from other species (56.28% identity to acetyl-CoA
carboxylase from Emericella nidulans).
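
A percent-identity figure like the one quoted above depends on how the alignment is scored. The source does not state the definition used by the homology search software, so the following Python sketch is only one plausible convention: identical residues divided by the number of aligned columns, with gaps counted as mismatches. The aligned strings are toy data, not real ACC sequences.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Convention assumed here: columns where both sequences have a gap are
    ignored; every other column counts toward the denominator.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical): 9 identical columns out of 12.
print(percent_identity("MVVDHESV-RHF", "MVIDHE-VARHF"))
```

Other tools divide by the shorter sequence length or exclude gap columns entirely, so values from different programs are not directly comparable.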

Fig. 1 depicts a cloned DNA fragment covering the ACC gene region on the chromosome of
P. rhodozyma.
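
The exon and intron features in the sequence listing for SEQ ID NO:1 are given as 1-based inclusive coordinates, and each intron should fill the gap between two neighboring exons. The following Python sketch derives intron intervals from a sorted exon list. It uses only the first four exon coordinates from the listing, and the function name is hypothetical.

```python
def introns_from_exons(exons):
    """Return intron intervals (1-based, inclusive) lying between sorted exons."""
    exons = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# First four exons as given in the sequence listing for SEQ ID NO:1.
exons = [(1446, 1482), (1676, 1758), (1833, 1957), (2031, 2171)]
print(introns_from_exons(exons))
print([end - start + 1 for start, end in exons])  # exon lengths in bp
```

The first derived interval, (1483, 1675), agrees with the corresponding intron feature in the listing.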

Example 14: Construction of antisense plasmid for ACC gene 

An antisense gene fragment which covers the entire structural gene for ACC is
amplified by PCR and then cloned into an integration vector in which the antisense ACC gene is
transcribed by its own ACC promoter in P. rhodozyma.
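
An antisense transcript is complementary to the sense mRNA, so at the DNA level the antisense insert corresponds to the reverse complement of the structural gene placed downstream of the promoter. The following Python sketch illustrates this. The helper name is hypothetical, and the example uses only the first six codons of the open reading frame as printed in the sequence listing (atg gtt gtc gat cac gag).

```python
# Translation table for DNA bases (N kept as N), both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


sense = "ATGGTTGTCGATCACGAG"  # Met Val Val Asp His Glu
antisense = reverse_complement(sense)
print(antisense)
```

Applying the function twice returns the original sequence, which is a convenient check when preparing primer or insert sequences.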

The primers include an asymmetrical recognition sequence for the restriction enzyme SfiI
(GGCCNNNNNGGCC), but their asymmetrical overhang sequences are designed to be
different. This enables directional cloning into an expression vector which has the same
asymmetrical sequences at its ligation sites. The use of such a construction is
disclosed in EP 1,158,051.
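
SfiI cuts within the five unspecified bases of its site (GGCCNNNN^NGGCC), leaving a 3-nucleotide 3' overhang whose sequence is set by the primer design. The following Python sketch extracts that overhang from each SfiI site so that two sites can be compared. The two example sites are hypothetical and are not the primers of this document.

```python
import re

# A lookahead is used so that overlapping sites are also reported.
_SFII = re.compile(r"(?=GGCC([ACGT]{5})GGCC)")


def sfii_overhangs(seq):
    """Return the 3-nt overhang (top strand, 5'->3') for each SfiI site."""
    # Cutting GGCCNNNN^NGGCC on both strands leaves N2-N4 single-stranded.
    return [m.group(1)[1:4] for m in _SFII.finditer(seq.upper())]


def overhangs_differ(site_a, site_b):
    """True if each sequence has exactly one SfiI site and the overhangs differ."""
    a, b = sfii_overhangs(site_a), sfii_overhangs(site_b)
    return len(a) == 1 and len(b) == 1 and a[0] != b[0]


# Hypothetical sites embedded in short flanking sequence.
left = "AAGGCCATTACGGCCTT"
right = "AAGGCCGCCTCGGCCTT"
print(sfii_overhangs(left), sfii_overhangs(right), overhangs_differ(left, right))
```

Distinct overhangs are a necessary condition for directional ligation; a full design check would also confirm that each insert end matches the intended vector end.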

For the promoter and terminator fragments which can drive the transcription of the
antisense ACC gene, the ACC promoter and terminator are cloned from the chromosome by
using the sequence information listed in SEQ ID NO:1. The ACC terminator fragment is
fused to a G418 resistance cassette by ligating the DNA fragment containing the ACC
terminator and the G418 resistance cassette of pG418Sa330 (EP 1,035,206) into an appropriate
vector such as pBluescript II KS- (Stratagene).
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1. An isolated polynucleotide comprising a nucleic add molecule one or more selected 
from the group consisting of:

(a) nucleic acid molecules encoding at least the mature form of the polypeptide depicted in
SEQ ID NO:3;

(b) nucleic acid molecules comprising the coding sequence as depicted in SEQ ID NO:2;

(c) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (a) or (b);

(d) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (a) to (c) by way of substitution, deletion and/or addition of one or
several amino acids of the amino acid sequence of the polypeptide encoded by a nucleotide
of (a) to (c);

(e) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (a) or (b);

(f) nucleic acid molecules comprising a fragment encoded by a nucleic acid molecule of
any one of (a) to (e) and having acetyl-CoA carboxylase activity;

(g) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from a Phaffia nucleic acid library using the primers depicted in SEQ
ID NO:4, 5, and 6;

(h) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (a) to (g);

(i) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (a) to (d);

(j) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (h);

(k) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of (a)
to (j), and encoding a polypeptide having acetyl-CoA carboxylase activity;

(l) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (a) to (k), and encoding a
polypeptide having acetyl-CoA carboxylase activity.

2. An isolated polynucleotide comprising a nucleic acid molecule one or more selected
from the group consisting of:
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(m) nucleic acid molecules comprising the nucleotide sequence as depicted in SEQ ID 
NO:1;

(n) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (m);

(o) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (m) or (n) by way of substitution, deletion and/or addition of one
or several amino acids of the amino acid sequence of the polypeptide encoded by a
nucleotide of (m) or (n);

(p) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (m);

(q) nucleic acid molecules comprising a fragment encoded by a nucleic acid molecule of
any one of (m) to (p) and having acetyl-CoA carboxylase activity;

(r) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from a Phaffia nucleic acid library using the primers depicted in SEQ
ID NO:4, 5, and 6;

(s) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (m) to (r);

(t) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (m) to (o);

(u) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (m) to (s);

(v) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of
(m) to (u), and encoding a polypeptide having acetyl-CoA carboxylase activity;

(w) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (m) to (v), and encoding a
polypeptide having acetyl-CoA carboxylase activity.

3. The isolated polynucleotide of claim 1 or 2, wherein said polynucleotide encodes an amino
acid sequence which is identified by SEQ ID NO: 3 or has an identity of 56.3 % or more with
SEQ ID NO: 3.

4. The isolated polynucleotide of any one of claims 1 to 3, wherein said polynucleotide is
derived from a strain of P. rhodozyma or Xanthophyllomyces dendrorhous.
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5. A method for making a recombinant vector comprising inserting the polynucleotide of 
any one of claims 1 to 4 into a vector.

6. A recombinant vector containing the polynucleotide of any one of claims 1 to 4 or 
produced by the method of claim 5. 

7. The vector of claim 6 in which the polynucleotide of any one of claims 1 to 4 is
operatively linked to expression control sequences allowing expression in prokaryotic or
eukaryotic cells.

8. A method of making a recombinant organism comprising introducing the vector of 
claim 6 or 7 into a host organism. 

9. The method of claim 8, wherein said host organism is selected from E. coli, baculovirus,
or S. cerevisiae.

10. The recombinant organism containing the vector of claim 6 or 7, or produced by the
method of claim 8 or 9.

11. A process for producing a polypeptide having acetyl-CoA carboxylase activity
comprising culturing the recombinant organism of claim 10 and recovering the
polypeptide from the culture of said recombinant organism.

12. A polypeptide obtainable by the process of claim 11.

13. An antibody that binds specifically to the polypeptide of claim 12.

14. An antisense polynucleotide against the polynucleotide of any one of claims 1 to 4.

15. A method for making a recombinant vector comprising inserting the polynucleotide of
claim 14 into a vector.

16. A recombinant vector containing the polynucleotide of claim 14 or produced by the
method of claim 15.

17. The vector of claim 16 in which the polynucleotide of claim 14 is operatively linked to
expression control sequences allowing expression in prokaryotic or eukaryotic cells.

18. A method of making a recombinant organism comprising introducing the vector of
claim 16 or 17 into a host organism.
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SEQUENCE LISTXNG<110> 

Roche Vitamins AG 
<120> ACC gene
<130> MDR5217
<160> 10
<170> PatentIn version 3.1

<210> 1
<211> 10561
<212> DNA
<213> Phaffia rhodozyma
<220>
<221> 5'UTR
<222> (1221)..(1222)
<223>
<220>
<221> polyA_site
<222> (9813)..(9814)
<223>
<220>
<221> exon
<222> (1446)..(1482)
<223>
<220>
<221> exon
<222> (9231)..(9530)
<223>
<220>
<221> exon
<222> (7296)..(9160)
<223>
<220>
<221> exon
<222> (7048)..(7227)
<223>
<220>
<221> exon
<222> (6899)..(6976)
<223>
<220>
<221> exon
<222> (5871)..(6832)
<223>
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<220> 
<221> exon
<222> (5674)..(5805)
<223>
<220>
<221> exon
<222> (5456)..(5608)
<223>
<220>
<221> exon
<222> (4984)..(5384)
<223>
<220>
<221> exon
<222> (4096)..(4911)
<223>
<220>
<221> exon
<222> (3828)..(4026)
<223>
<220>
<221> exon
<222> (3075)..(3443)
<223>
<220>
<221> exon
<222> (3518)..(3552)
<223>
<220>
<221> exon
<222> (1676)..(1758)
<223>
<220>
<221> exon
<222> (1833)..(1957)
<223>
<220>
<221> exon
<222> (2031)..(2171)
<223>
<220>
<221> exon
<222> (2244)..(2641)
<223>
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19. The method of claim 18, wherein said host organism is belongs to a strain of Phaffia 
rhodozyma or Xanthophyllomyces dendrorhous.

20. The recombinant organism containing the vector of claim 16 or 17, or produced by the
method of claim 18 or 19.

21. The recombinant organism of claim 20, wherein said organism is characterized in that
its gene expression of acetyl-CoA carboxylase is reduced compared to the host
organism, and is thereby capable of producing carotenoids at an enhanced level relative to
the host organism.

22. The recombinant organism according to claim 21, wherein the gene expression of
acetyl-CoA carboxylase is reduced by means of the technique selected from antisense
technology, site-directed mutagenesis, error prone PCR, or chemical mutagenesis.

23. A process for producing carotenoids, which comprises cultivating the recombinant 
organism of claim 21. 

24. The process of claim 23, wherein said carotenoids are selected one or more from
astaxanthin, β-carotene, lycopene, [remainder illegible in source].

25. The process according to claim 23, wherein the gene expression of acetyl-CoA
carboxylase is reduced in the recombinant organism of claim 21 by means of the technique
selected from antisense technology, site-directed mutagenesis, error prone PCR, or
chemical mutagenesis.


26. A process for the production of a carotenoid by culturing a microorganism under
suitable conditions and, optionally, recovering the resulting carotenoid, wherein the
microorganism is characterized in that its gene expression of acetyl-CoA carboxylase is reduced,
e.g. by means of the technique selected from antisense technology, site-directed
mutagenesis, error prone PCR, or chemical mutagenesis.
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10 
<220>
<221> exon
<222> (2746)..(2991)
<223>
<220>
<221> exon
<222> (3626)..(3750)
<223>
<220>
<221> Intron
<222> (1483)..(1675)
<223>
<220>
<221> Intron
<222> (4912)..(4963)
<223>
<220>
<221> Intron
<222> (3751)..(3827)
<223>
<220>
<221> Intron
<222> (5385)..(5455)
<223>
<220>
<221> Intron
<222> (6977)..(7047)
<223>
<220>
<221> Intron
<222> (3553)..(3625)
<223>
<220>
<221> Intron
<222> (5806)..(5870)
<223>
<220>
<221> Intron
<222> (9161)..(9230)
<223>
<220>
<221> Intron
<222> (3444)..(3517)
<223>
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3 

caccetgtct ttgtctcttt gcctttcgtt ciatctagcg ctgttcaacg gatcactcag 300 

r t 

tcggcttgac tcaactccct ctggaacgtg tgccttatct caggttctga tttctcctca 360 
gccagtatgc gcacaaagca gcgatcgtga ctttttgcec cataagaect ctcagcgggg 420 

i 

aatatatgac acccatacat cgatagctcg tatgttttct ttgatcactt cctaaaatgt 



aecgtecttc tatacagccc agtcagctcg tcgacctcac ataaagtgac tgagaccgcg 



acaag atg gtt gtc gat cac gag age gta agg cat ttc ate g 
Met Val Val Asp His Glu Ser tfal Arg His Phe life 
1 5 i 10 



480 



10 aaeggcaact gacattcaac atgatgeget ttcatagacc aactacttzcc gactacgatg 540 



600 



atctcgaaca tcttattcct tccaccgtia gctgagaagt ggattacacc atcaatagaa 660 

15 • ; 

tcatctaccc cgttcttgcc tggactaaeg cgtcaggage tcttggatjaa aggagaaata 720 



gctgagcaga ccatcacctt ggatgatgtc cgtctgtggc tgaactcegg aggtcgagtg 780 

» * v 

; * * 

20 gcgtgctgca acgcacttcg aggaatttgg gaagtgaacc tcgtttggag tgataaatga 840 

gattacgaaa gtctgttcga aacatccadg ctteatgata accgataaeg cttaaatctt 900 

gagagtgege acatcgatcg ccttttatat acggggttgg ggaaacatjaa agtgttcata 960 

; ■ ; 

gactattgtt catatatctt aaagtacaaa gaegcatcta accctaagcc tgaatgatrg 1020 

gcaaaatcct agtaagaccg cgaaatccqg aagaataege agttcattaa taaagatata 1080 

30 gcttaggtaa gcagcggttg ctcccccaHc caacctcatc cgaaattccc cagggggttg 1140 
■ » > 

agattctcaa ggctttgaat ccccatcccg tqaagttggt cttaaaccct tcatctctac 1200 

■ ; i 

ttgttacttc ttttcttcct gacctccttc cdecaeteee tectattckc tgaacgaact 1260 

35 ; ; : 

cgcctccetg tccatctact ettctteggt tttcttttgg gtttttactt ctctcgttcc 1320 

tccfcccatct ttccatctct tttegtatet gtgggtaact ttgcanccaa gggccctcac 1380 

40 acataaccct atatccatct tcetecattc aeaeacatet gtactcaacc aacaaagctc 1440 



1482 
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10 



srtaagcgttc tcgeccetct ccttgcctgg ctccctgcat tttcttaaiac gatctaggaa 

5 

gagagggaaa ttacatctgg tcaattttcc gcgctctttt ccttggggac aaaagaatgc 

« 

ctttctgtga tcggagatcg gttgctgatc tcttttgtct tgttcttt,tt gct'ctttccc 

* » 

tccectttae cag gt gga aac gca cttigag aac gcc cct ccg tea age 
Gly Gly Asn Ala LeufGlu Asn Ala Pro Pro Ser Ser 



1542 
1602 
1662 
1710 






gtc acc gat ttc gtt aga agt caa gat ggt cac acg gtc 'ate acc aaa 1758 
val Thr Asp Phe val Arg Ser Gin Asp Gly His Thr Val He Thr Lys 
25 30 . ; 35 40 

\ 
i 

gteagtaatt ttcatttttt ccctcacgta gcctcagggc caaggagcta aattgettet 1818 

geaecacccc tcag gtc etc att gcc? aac aac gga ate get get gta aaa 186B 
Val Leu He Ala Asn Asn Gly He Ala Ala Val Lys 

gag acc cga tea gtc cgt aaa egg get cac gag acg ttt gga gat gag 1916 
Glu He Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr Phe Gly Asp Glu 
55 60 : 65 ; 

cga gee ate gaa ttt acg gta atg gcc act cca gaa gat tt 1957 
Arg Ala He Glu Phe Thr val Mee Ala Thr Pro Glu Asp pen 
70 75 i 80 •' 

i • * 

30 gttegtacca atcacataag ctttccttga gtcagggaca tcctctaatt aattcaactt 2017 

I 

gagegecata cag g aag gtg aac tgc gac tat att cga atgi get gat cga 2067 
Lys Val Asn Cys Asp Tyr He Arg Met Ala Asp Arg 
.. 85 • 90 
35 gtc gcc gaa gcc ccc gga gga act aac aac aac aat cac tet aac gtc 2115 
Val Val Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val 
55 100 ! 105 no 

gac etc acc gcc gac acc gcc gag cga ccc aac aca cac gcc gcc tgg 2163 
40 Asp Leu lie Val Asp lie Ala Glu Arg Phe Asn lie His Ala Val Trp 

H5 = 120 125 

get gga eg gtaagtaaaa taggacctta acatgtcgga agaagagegc 2211 
Ala Gly Trp ' : 
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ccacctaaac gcgctttctt tccatccgac ag g ggt cac get teg gaa aac cce 

Oly His Ala Ser Glu Agn Pro 
130 135 



2265 



10 



aga ctt cce gag tct etc gee gee tea aag aac aag ate ;gtc ttc att 2313 
Arg Leu Pro Glu Ser Leu Ala Ala Ser Lys Asn Lys lie ;Val Phe lie 
140 I 145 1 .150 

t 

ggt cct cce gga tee get atg cg4 tec ctt gga gac aag [att tct teg 2361 
Gly Pro Pro Gly Ser Ala Met Arg Sefc Leu Gly Asp Lys : Ile Ser Ser 
155 160 165 I 
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Thr He Val 
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gee cag tct 
Ala 61a Ser 



gec cag gt$ ccg tgt atg gec ]tgg tct gga 
Ala Gin Vai Pro Cys Met Ala Trp Ser Gly 

: i 180 { 
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tea ggc ate 
20 Ser Gly He 
185 



act gat aca 
Thr Asp Thr 
190 



gag cti age 
Glu Leu Sel- 



ect cag ggc ttc gtg act gtg 
Pro Gin Gly Phe Val Thr Val 

195 v 200 
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cce gat ggg 
Pro Asp Gly 



ttg gtg cga 
Leu Val Arg 



gag gga gga 
Glu Gly Gly 
235 

ttc aag aac 
Phe Lys Asn 
250 . 



eea tat cag 
Prp Tyr Gin 
205 

gec gag aag 
Ala Glu Lys 
220 

gga gga aag 
Gly Gly Lys 

tec tac aae 
Ser Tyr Asn 



get get tgt gta aag acg gtg gag gat ggt 
Ala Ala Cys Val Lys Thr Val Glu Asp Gly 
210 215 
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ate ggt tt£ 
He Gly Leii 
225 

» 

ggt ate cga 
Gly He Arg 
240 ; 
tec gtc get 
Ser Va^ Alai 
255 : 



cca gtt atg ate aag gee tct 
Pro Val Met He Lys Ala Ser 
230 

atg gtt cac age ktg gac aca 
Met Val His Ser Met Asp Thr 

245 : . 

tec gag gtg cca g gtaagttcac 
Ser Glu Val Pro : 

s 

260 
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tctgtttgac tggagatttg agcaeaatct ct^aecatggg agttcaagaa ggaataccca 2711 

40 ctcatgaatt gaegactgeg ttcttgacqt cljag ga tct ccg att ttc ate atg 2765 

Gly Ser Pro lie Phe lie Met 
: . 265 



gec ttg get gga tct get cga cat ttg gag gtc cag etc ctt get gat 
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cga ate acg age gaa aac ccc gat gag ggg ttc aag ccg tct gec gga 3397 
Arg He Thr Ser Glu Asn Pro Asp Glu Gly Phe Lys Pro Ser Ala Gly 
440 I 445 -450 
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5 gat ate caa gag ttg aac ttc aga agt aat act aac gtc tgg gga t 3443 
Asp. lie Gin Glu Leu Asn Phe Arg Ser Asn Thr Asn val Trp Gly 
455 460 4^5 
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gtgagtacag aggcttctca aagattctta tgcggaacaa atctctgact cttaaattgt 3503 

» 

gtttgacttt caag ac ttc tct gtt gga get act gga gga att cat agt 3552 
Tyr Phe Ser val Gly Ala Thr Gly Gly He His Ser 

47<j 475 1 

> 

15 gtaagtttct tcgccaacaa tataatcaca cfcagatccct atctaatctg aactggctta 3612 

I , 

i 

tctcttgtta tag ttc gec gat tct caa ttc ggt cac gtg tct get tat * 3661 
Phe Ala Asp Ser .Gin : Phe Gly His Val Phe Ala Tyr 
480 ' 485 ' 490 

; i 

20 . : 

gge tec gac cga acg act gec aga aag aat atg gtt ate jyce ttg aaa 3709 

Gly Ser Asp Arg Thr Thr Ala Arg Lys Asn Met Val lie Ala Leu Lys 
495 1 500 i 505 

1 ! i 

25 gag ctt tec att cga gga gac etc cga acc act gtc gag ta 3750 
Glu Leu Ser He Arg Gly Asp Phe Arg Thr Thr Val Glu Tyr 

510 .519 

: j 1 

t 

• t 

gtgegtatag cctggtacat crtcctttcaa tcjacttacga tgaaetgapc gatctgtctc 3810 

30 J i 

\ ; 
gatcacgttt aatctag t ctt ate act ctt ctt gag acg agejgat ttc gag 3861 

Leu lie Thr Leu Leu Glu Thr. Ser.. Asp Phe Glu 
•i 525 5 530 

35 

* t 

cag aac gec att acc acc get tgg ttg gat ggg ttg ate act aac aag 3909 
Gin Asn Ala lie Thr Thr Ala Trp Leu Asp Gly Leu He Thr Asn Lys 

535 B 540' 545 

* 

40 ctt aca tct gag agg cct gat cca tcsi ctg gec gtt att tgt ggt gca 3957 
Leu"Thr Ser Glu Arg Pro Asp Pro, Ser Leu Ala Val He Cya Gly Ala 
550 555. ( 560 ! 
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cga cga cat cag aag ate ate gag gag get ccc gtc acg &tc get cgt 
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305 \ 310 \ 315 
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cca gag aga ttc gaa gag atg gag aag get get gtc agg ptg gec aag 2957 
Pro Glu Arg Phe Glu Glu Met Glu Lys Ala Ala Val Arg £eu Ala Lys 
320 325 330 

tta gta gga tat gtt agt gec ggt acd gtc gaa t gtaaggaaca 3001 
Leu Val Gly Tyr Val Ser Ala Gly Thr Val Glu 
33S 340 \ 



20 aacagctacc tctcattctg ttttttcgag atagtcaact tacatcactt ttettttgee 

« 

V 

ggattttctt tag ac etc tac tct cac ;gcc gac gac tea ttc ttc ttc 
Tyr Leu Tyr Ser His : Ala Asp Asp Ser Phe Phe Phe 
345 ? 350 J 355 
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etc gaa etc aac cct cga ctt caa gtc gag cac cct act acc gag .atg 
Leu Glu Leu Asn Pro Arg Leu Gin Val Glu His Pro Thr Thr Glu Met 
360 : 365 \ 370 
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30 gtc teg ggt gtc aac ctt ccc get get cag ctt cag att get acg ggt 
Val Ser Gly Val Asn Leu Pro Ala Ala Gin Leu Gin He Ala Met Gly 

375 380 385 

ate cct ctt tct cga att egg gat att cga gtc etc tac ggt etc gat 
He Pro Leu Ser Arg He Arg Asp He Arg Val Leu Tyr Gly Leu Asp 

35 390 395 400 • 
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ccc cac act gtt tec gag ate gac ttc gac age age aga gcg gag tct 3301 
Pro His Thr Val Ser Glu He A6p Phe Asp Ser Ser Arg Ala Glu Ser 
405 410 ! 415 ; 

gtc cag act cag agg aag cct agg ccc aag ggt cac gtc att gec tgt 3349 
Val Gin Thr Gin Arg Lys Pro Arg Pro Lys Gly His Val lie Ala Cys 
420 425 i 430 435 
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att gtg aaa get cac gtg get tefc gag aac tgt tgg gec gaa tac cga 4005 
lie Val Lys Ala His Val Ala Ser Glu Ajm Cys Trp Ala Glu Tyr. Arg 

565 570 « ! 575 T 

* i 

; J ■ 

5 cga gta ttg gac aag gga cag gtaagctetg tttctcatga agtttttgac 4056 
Arg Val Leu Asp Lys Gly Gin - ' 

560> 585 1 \ 

tgaggcacte accaetccgt acatgtttcc tgtttttag gtt ccc tec aag gac 4110 
10 Val Pro Ser Lys Asp 

; 590 

; ; i 

act etc aag aca gtg ttc act ctt gal: tec ate tat gag ggt gtt egg 4158 
Thr Leu Lys Thr Val Phe Thr Leu As£ Phe lie Tyr Glu Gly Val Arg 
15 595 600 fc05 






tac aat 
Tyr Asn 



eta aac 
Leu Asn 
625 

* 

gga atg 
Gly Met 
640 



ttc acc 
Phe Thr 
610 

gga gga 
Gly Gly. 



cga ttg tat 
Arg Leu Tyr 



gee gat ggt 

Ala Asp Gly 

\ 
\ 



etc gtt 
Leu val 



gag gaa 
30 Glu Glu 

att gag 
lie Glu 



get get cga gec teg etc aac act tac 
Ala Ala Arg Ala Ser Leu Asn Thr Tyr 
615 620 

aag acc gtg gtg tec ate cga cct ttg 
Lys Thr Val Val Ser He Arg Pro Leu 
630 635 

J 

ctt etc gat ggc cgsi tec cac act etc tac tgg agg 
Leu Leu Asp Gly Arg Ser Hie Thr Leu fryr Trp Arg 
645 \ 650 : 555 

; • \ 

act tgc ctg 
thr Cys Leu 
j 670 
ccg cct gga 
Ser Pro Gly 
685 
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aag ate 
Lys He 



gtc ggt ace etc cga att cag 
tfal Gly Thr Leu Arg He Gltt 
660 

cag gag aac gac ccc act cag 
Gin Glu Asn Asp Pro Thr Gin 

675 \ esq 

f 

ate egg ttt ttg gtc gaa age 
He Arg Phe Leu Val Glu Ser 
690 695 



gta gac gea aag 
val Asp Ala Lys 
665 

etc cga tea ccc 
Leu Arg Ser Pro 



gga gat cac ate 
Gly Asp His He 
700 



tec tec gga 
Ser Ser Gly 



4254 



4302 



4350 



4398 



4446 . 



40 gat ate tat get gag gtt gag gtc atg aag atg ate ttg ccc ttg att 
Asp He Tyr Ala Glu Val Glu Val Met Lys Met He Leu Pro Leu He 
705 710 715 
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etc cct aac ctg ccc ggt aac aga cct cac cag egg eta ;cag etc cag 
Leu Pro Asn Leu Pro Gly Asn Arg Pro His Gin Arg Leu -Gin Phe Gin 
15 770 775 780 ] 

• i 

: • * 
ctt gag teg ata tac teg gtc ttg gat gga tac gag agt ;gac tec act 
Leu Glu Ser He Tyr Ser Val Leu Asp Gly Tyr Glu Ser |Asp Ser Thr 
7B5 790 795 * 

; 

20 j 

gca aca ate etc cga tea ttc tct gaa aac ctt tat gat !cct gat ctt 
Ala Thr ile Leu Arg Ser Phe Ser Qlix Asn Leu Tyr Asp J?ro Asp Leu 
800 805 810 ) B15 

25 get ttc gga gag get tta tec ate att tec gtc ctt tct ~ggg aga atg 
Ala Phe Gly Qlu Ala Leu ser He lie Ser val Leu Ser :<51y Arg Met 
820 * S25 I 830 
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gec cag gag Ccc ggt cac gtt'cag ttt gtc aag caa gec -;ggt gtg ace " 
Ala Gin Qlu Ser Gly His Val Gin Phe Val Lys Gin Ala ,'Gly Val Thr 

720 725 : 730 - 735 
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gtc gat cct gga gcg att att ggg ate- ttg agt ctt gat >gae cct acg. 4590 
val Aap Pro Gly Ala lie lie Gly lie Leu Ser Leu Asp jAsp Pro Thr 
740 j 745 ; 750 

; j 

cga gtg aag aag gcg aag ccc ttc gag ggt etc ctg cct jgtg act ggt 4638 
Arg val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro !Val Thr . Gly 
755 " 760 ;765 
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4734 



4782 



4830 



4878 



cct gec gat ctt gag gag age atfc cga gag gec ate age *gaa get cag 
Pro Ala Asp Leu Glu Glu Ser lie Arg Glu Val Ile Ser Glu Ala Gin 

835 • 840 "845 

teg aag cct cac gec gag ttc ccp gga tea aag gtgtgcagtt gategcagag 4931 
Ser Lys Pro His Ala Glu Phe Pro Gly Ser Lys \ 
8S0 855 

ttatgactgt atacatcgac cagaagctta cecatctctt tcgtgtgcac ag ate etc 4989 

: ? lie Leu 

i 860 

40 aaa gtc gtc gag egg tac ate gat aat rtg cga cct cag :gag agg get 5037 
Lys Val Val Glu Arg Tyr lie Asp Aen Leu Arg Pro Gin .Glu Arg Ala 
065 % 870 875 
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acg gtc cga act cag ate gaa ccc ate get ggt att get gag aag aac 5085 
Met Val Arg Thr Gin. lie Glu Pro- lie Val Gly He Ala Glu Lys Asn 
B80 i aas 890 



10 



gtt ggc ggt cct aag ggt tac gec tct tac gtc tta get acc ate ett 
Val Gly Gly Pro Lys Gly Tyr Ala Ser Tyr Val Leu Ala Thr He Leu 
895 900 905 [ 

caa aag ttc ttg gec gtt gag gee gtt ctt get acc ggt agt gaa gag 
Gin Lys Phe Leu Ala Val Glu Ala Val Phe Ala Thr Gly Ser Qlu Glu 

910 ' 915 I . 920 ', 






tct ttg aag get egg gaa att ctt ate tct tgc cct cW ccc tct 
Ser Leu Lys Ala Arg Glu He Leu ile Ser Cys Ser Leu Pro Ser 
1000 1005 1010 

tac gag gag agg ttg ttc cag atg &aa aag ate ctt aac tct tct 
Tyr Glu Glu Arg Leu Phe Gin Met Glu Lys He Leu Asn Ser Ser 
1015 1020 ; 1025 
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5181 
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5277 



gee att get* etc caa ctt ega gat ga* aac cga gaa tct ttg aac gac 
Ala Ile val Leu Gin Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp 

13 925 : WO : ■ : 935 I 940 

i \ 

gtc ctt ggt etc gtc ctg.gct ca ? tc$ cgt etc age get icga tec aag 
Val Leu Gly Lou Val Leu Ala His set Arg Leu Ser Ala Arg Ser Lys 

345 j 1 950 : 955 

20 > : . 

ctt gtt etc tec gtc ttt gat ccg ate aag tct atg cag etc etc aac 5325 
Leu Val Leu Ser Val Phe Asp Leu lie Lys Ser Met Gin -Leu Leu Asn 
960 . 965 -970 

. 5 . I 

25 aac act gag ggt tct ttc ctt cat aag act atg aaa gcg htt gec gac 5373 
Asn Thr Glu Gly Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp 
975 980 j 985 : 

: ) i 

atg ccc acc aa gtaggtttcc tcttgtagtc tacaaactat tgttgcgatg 5424 
30 Met Pro Thr Lys 

990 : 
tgttgacaaa gactccgttt ccgatctata g ; g get cct ttg gec age aag gtg 5477 

Ala Pro Lau Ala ser Lys Val 
995 
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atg acg age the aac aac teg aag gag gtt eag gac gga etc ttg 6129 
Met Thr Ser Phe Asn Asm Leu Lys Glu Val- Gin Asp Giy Leu Leu 
1160 1165 ; 1170 



3 aafc gtt ctg tct ttc ttc cct get tac cat cat caa gat etc act 
Asn val Leu Ser Phe Phe Pro Ala Tyr His His Gin Asp Phe Thr 

1175 1180 : 1185 j 



tgg gec aag agt gec gag teg ctg gta atg cag atg tct gec gag 
Trp Ala Lys Ser Val Glu Ser Leu Val Met Gin Met Ser Ala Glu 
1220 1225 . « 1230 ': 



20 



ate cag aag aag gga att cga qga gtt acc ttc ttg gtt tgc cga 

lie Gin Lys Lys Gly lie Arg Arg val Thr Phe Leu val Cys Arg 

1235 1240 • •: 1245 \ 

> 

23 aag ggc gtt tac cec tec tac ttc acc ttc aga caa gag ggt gec 

Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gin Glu Gly Ala 

1250 1255 .' 1260 » 
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cag ggc ccc tgg aga gag gag gag «iag att cga aac ate gag cct 
Gin Gly Pro Tzp Arg Glu Glu Glu Lys He Arg Asn He Glu Pro 

1265 1270 . : 1275 j 

get- eta gee agt cag ett gag etc aac cga etc teg aat ttc aag 
Ala Leu Ala Ser Gin Leu Glu Leu isn Arg Leu Ser Asn Phe Lys 
1280 1285 I 1290 ' 

35 •' ; 

gtc acc cct ate ttc gta gac aac aga cag ate cac ate tac aag 
Val Thr pro He Phe Val Asp Asn Arg Gin He His lie Tyr Lys 
1295 1300 : 1305 , 
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caa cga cat ggt cag gac agt gee atg ccc aac get etc aac att 6219 
10 Gin Arg His Gly Gin Asp Ser Ala Met Pro Aan Val Leu Asn He 
1190 1195 .- 1200 : 

get ate egg get ttc gag gag aag gac gac atg tct gat ctt gat 6264 
Ala He 'Arg Ala Phe Glu Glu Lys Asp Asp Met Ser Asp Leu Asp 
13 ' 1205 - 1210 : 1215 j 
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gga gtg ggt aag gag aac tct tec gat gtt cga ttc ttt ate egg 6579 
Gly Val Gly Lys Glu Asn Ser Ser Asp val Arg Phe phe He Arg 
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Val Thr Thr Ser Tyr Tyr Gly Glu T?hr Gly Gly Gly His Arg 
1030 1035 1040 
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gtttgtcctc fceecatgtgt ttctagttca tagctctctg ctgactctpra tccgattttc 5668 

aacag a aac cct tcg gtt gat gtt ctg act gag atc tca aac tct 5713
Asn Pro Ser Val Asp Val Leu Thr Glu Ile Ser Asn Ser
1045 1050 1055

cga ttc acc gtc tac gat gtc ctg tcc tcc ttc ttc aag cac gat 5758
Arg Phe Thr Val Tyr Asp Val Leu Ser Ser Phe Phe Lys His Asp
1060 1065 1070

Asp Pro Trp Ile Val Leu Ala Ser Leu Thr Val Tyr Val Leu Arg
1075 1080 1085
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gq gtaagtgatc gttcttctcc tcttgccc&a acaatgactg acagttctat 
20 Ala \ 



5855 
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ctattccatc tgcag 



t tac cga gag tac agt att ctt gat atg caa cat
Tyr Arg Glu Tyr Ser Ile Leu Asp Met Gln His
1090 1095

gag caa ggt cag gat ggc gct gct gga gtc atc act tgg cga ttc 5949
Glu Gln Gly Gln Asp Gly Ala Ala Gly Val Ile Thr Trp Arg Phe
1100 1105 1110

aag ctc aac cag ccc atc gct gag tct tct act ccc cga gtt gac 5994
Lys Leu Asn Gln Pro Ile Ala Glu Ser Ser Thr Pro Arg Val Asp
1115 1120 1125

tcg aat cga gac gtt tac cga gtc ggt tcg ctt tct gat ttg acc 6039
Ser Asn Arg Asp Val Tyr Arg Val Gly Ser Leu Ser Asp Leu Thr
1130 1135 1140
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Tyr Lys He Lys 
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Thr Glu Pro Leu Arg 
1 1155 
gct ttg gtt cga cct gga cgg gtc cag gga tcg atg aag gct gcc 6624
Ala Leu Val Arg Pro Gly Arg Val Gln Gly Ser Met Lys Ala Ala
1325 1330 1335
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1340 1345 1350 
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1385 1390 1395 
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1400 1405 
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1415 1420 1425 
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1430 

Gly Phe

gtc gtg aag tac cac gcc tac cag gag gtt gag acc gag aag ggt 7097
Val Val Lys Tyr His Ala Tyr Gln Glu Val Glu Thr Glu Lys Gly
1440 1445 1450

act acc atc ttg aag tca atc gga gac ctt gga cct ctt cac ctt

Gln Pro Val Asn His Ala Tyr Gln Thr Lys Asn Ser Leu Gln Pro
1470 1475 1480

cga cga tac cag gct cac ttg gtt gga acg act tac gtc t

Tyr Asp Tyr Pro Asp Leu Phe Val Gln Ser Leu Arg Lys
1495 1500 1505

gtt tgg gct gag gct gct gct aag att cct cac ctc cgg gtg cct
Val Trp Ala Glu Ala Ala Ala Lys Ile Pro His Leu Arg Val Pro
1510 1515 1520

Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp Glu Asn Asn
1525 1530 1535

gag ctt cag gag gtc gag cga cct ccg ggt tcc aac tcg tgt ggt
Glu Leu Gln Glu Val Glu Arg Pro Pro Gly Ser Asn Ser Cys Gly
1540 1545 1550

atg gtc gcc tgg atc ttc act atg ctc act ccc gag tat ccc aag 7513
Met Val Ala Trp Ile Phe Thr Met Leu Thr Pro Glu Tyr Pro Lys
1555 1560 1565

ggt cga cga gta gtt gcc att gcc aac gat atc acc ttc aag att 7558
Gly Arg Arg Val Val Ala Ile Ala Asn Asp Ile Thr Phe Lys Ile
1570 1575 1580

gga tcc ttt ggt cct aag gaa gac gat tac ttc ttc aag gct act 7603
Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe Lys Ala Thr
1585 1590 1595

gaa att gcc aag aag ctg ggc ctc cct cga att tac ctc tct gcc 7648
Glu Ile Ala Lys Lys Leu Gly Leu Pro Arg Ile Tyr Leu Ser Ala
1600 1605 1610

aac agt gga gct aga ctc ggt atc gcg gag gag ctc ttg cac atc 7693
Asn Ser Gly Ala Arg Leu Gly Ile Ala Glu Glu Leu Leu His Ile
1615 1620 1625

ttc aag gcg gcc ttc gtt gac ccc gca aag cct tcc atg ggt att 7738
Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser Met Gly Ile
1630 1635 1640

aag tat cta tac ttg acc cct gaa act tta tcc act ctt gcc aag 7783
Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr Leu Ala Lys
1645 1650 1655

aag gga tcc agc gtc acc act gag gag atc gag gat gac ggc gag 7828
Lys Gly Ser Ser Val Thr Thr Glu Glu Ile Glu Asp Asp Gly Glu
1660 1665 1670

cga cga cac aag atc acc gcc atc atc ggt ctt gca gag ggt ttg 7873
Arg Arg His Lys Ile Thr Ala Ile Ile Gly Leu Ala Glu Gly Leu
1675 1680 1685

gga gtt gag tct ctt cga gga tcc ggt ctt att gct gga gcc acc 7918
Gly Val Glu Ser Leu Arg Gly Ser Gly Leu Ile Ala Gly Ala Thr
1690 1695 1700

act cga gct tac gag gag gga atc ttc acc atc tct ctc gtt act 7963
Thr Arg Ala Tyr Glu Glu Gly Ile Phe Thr Ile Ser Leu Val Thr
1705 1710 1715

gcc cga tcg gtc ggt atc gga gct tac ttg gtt cga ttg ggt cag 8008
Ala Arg Ser Val Gly Ile Gly Ala Tyr Leu Val Arg Leu Gly Gln
1720 1725 1730

cga gct att cag gtt gaa ggc aac cct atg atc ctt act gga gct 8053
Arg Ala Ile Gln Val Glu Gly Asn Pro Met Ile Leu Thr Gly Ala
1735 1740 1745

cag tct ctc aac aag gtg ctt gga cga gag gtt tac act tcc aac 8098
Gln Ser Leu Asn Lys Val Leu Gly Arg Glu Val Tyr Thr Ser Asn
1750 1755 1760

ctt cag ctt gga gga acc cag att atg gcc cga aac ggt acc acg 8143
Leu Gln Leu Gly Gly Thr Gln Ile Met Ala Arg Asn Gly Thr Thr
1765 1770 1775

cat ctc gtc gct gaa tct gat ctc gat ggt gct ctc aag gtc atc 8188
His Leu Val Ala Glu Ser Asp Leu Asp Gly Ala Leu Lys Val Ile
1780 1785 1790

cag tgg ctc tcg tat gtg ccc gag cga aag ggc aag gcc att cct 8233
Gln Trp Leu Ser Tyr Val Pro Glu Arg Lys Gly Lys Ala Ile Pro
1795 1800 1805

atc tgg cct tcc gag gac cct tgg gac cga act gtg acc tac gag 8278
Ile Trp Pro Ser Glu Asp Pro Trp Asp Arg Thr Val Thr Tyr Glu
1810 1815 1820

cct ccc cga ggt cct tac gat cct cga tgg ttg ctt gaa gga aag 8323
Pro Pro Arg Gly Pro Tyr Asp Pro Arg Trp Leu Leu Glu Gly Lys
1825 1830 1835

ccg gat gaa ggc ttg act ggt ctt ttc gac aag gga tct ttc atg 8368
Pro Asp Glu Gly Leu Thr Gly Leu Phe Asp Lys Gly Ser Phe Met
1840 1845 1850

gag acc ctt gga gat tgg gcc aag act atc gtc acc ggt cga gcc 8413
Glu Thr Leu Gly Asp Trp Ala Lys Thr Ile Val Thr Gly Arg Ala
1855 1860 1865

cga ctg gga ggc att cct atg ggt gtt att gct gtc gaa acc agg 8458
Arg Leu Gly Gly Ile Pro Met Gly Val Ile Ala Val Glu Thr Arg
1870 1875 1880

acg acc gag aag atc atc gct gcc gat cct gcc aac cct gca gct 8503
Thr Thr Glu Lys Ile Ile Ala Ala Asp Pro Ala Asn Pro Ala Ala
1885 1890 1895

ttc gag caa aag att atg gag gct ggt cag gtt tgg aac ccc aac 8548
Phe Glu Gln Lys Ile Met Glu Ala Gly Gln Val Trp Asn Pro Asn
1900 1905 1910

gct gct tac aag acc gct caa tcc atc ttt gat atc aac aag gag 8593
Ala Ala Tyr Lys Thr Ala Gln Ser Ile Phe Asp Ile Asn Lys Glu
1915 1920 1925
ggt ctt cct ttg atg atc ctt gcc aac atc cga ggt ttc tct gga 8638
Gly Leu Pro Leu Met Ile Leu Ala Asn Ile Arg Gly Phe Ser Gly
1930 1935 1940

gga cag ggt gat atg ttt gac gct atc ctc aag cag ggt tct aag 8683
Gly Gln Gly Asp Met Phe Asp Ala Ile Leu Lys Gln Gly Ser Lys
1945 1950 1955

atc gtt gac ggt ctc tcg aac ttc aag cag cca gtg ttc gtc tat 8728
Ile Val Asp Gly Leu Ser Asn Phe Lys Gln Pro Val Phe Val Tyr
1960 1965 1970

gtt gtc ccc aac gga gag ctt cgt gga gga gct tgg gtc gtg ttg 8773
Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp Val Val Leu
1975 1980 1985

gat cct act atc aac ctt gcc aag atg gag atg tac gct gat gaa 8818
Asp Pro Thr Ile Asn Leu Ala Lys Met Glu Met Tyr Ala Asp Glu
1990 1995 2000

acc gct cga gga gga att ctc gag ccg gaa ggt atc gtt gag atc 8863
Thr Ala Arg Gly Gly Ile Leu Glu Pro Glu Gly Ile Val Glu Ile
2005 2010 2015

aag ttc cga cga gac aag gtc atc gct acc atg gag cga ttg gac 8908
Lys Phe Arg Arg Asp Lys Val Ile Ala Thr Met Glu Arg Leu Asp
2020 2025 2030

gag acc tat gcc tct ctc aaa gct gcc tcg aac gac tca acc aag 8953
Glu Thr Tyr Ala Ser Leu Lys Ala Ala Ser Asn Asp Ser Thr Lys
2035 2040 2045

tct gcg gag gag cga gct aag agt gct gag cta ctc aag gca aga 8998
Ser Ala Glu Glu Arg Ala Lys Ser Ala Glu Leu Leu Lys Ala Arg
2050 2055 2060

gag act cta ctt caa ccg acg tac ttg cag att gca cac ctt tac 9043
Glu Thr Leu Leu Gln Pro Thr Tyr Leu Gln Ile Ala His Leu Tyr
2065 2070 2075

gct gat ctc cat gat cgt gtc gga cga atg gag gcc aag ggt tgc 9088
Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala Lys Gly Cys
2080 2085 2090

gcg aag cga gct gtc tgg gct gag gct cga cga ttc ttc tac tgg 9133
Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe Phe Tyr Trp
2095 2100 2105

cga ctt cga cga cgt ctc aac gat gag gtgagccgtc ccattcactc 9180
Arg Leu Arg Arg Arg Leu Asn Asp Glu
2110 2115

tttcgttgca aggttcagta gtactaaccg cttctttctt tatctatcag cac atc 9236
His Ile

ctg tct aag ttc gct gct gcc aac ccg gat ctt act ctc gag gag 9281
Leu Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu
2120 2125 2130

cga caa aac att ctc gac tct gtc gtc cag act gac ctc act gat 9326
Arg Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp
2135 2140 2145

gac cga gcc acc gct gaa tgg att gag cag tct gca gaa gag att 9371
Asp Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile
2150 2155 2160

gct gct gcc gtt gcc gaa gtc cga tcc acc tac gtg tcg aat aag 9416
Ala Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys
2165 2170 2175

att atc agc ttc gcc gag acg gag cga gct gga gcg ttg cag ggc 9461
Ile Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly
2180 2185 2190

ttg gtc gct gtc ttg agc act ttg aat gcg gaa gac aag aag gcc 9506
Leu Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala
2195 2200 2205

ctt gtt tct agc ctt ggt ctc taa attttaactt nttttgtcga tgctattctt 9560
Leu Val Ser Ser Leu Gly Leu
2210

cctatcttta gtctttgatn aacttttgaa tatccttcat agatctttcc ttgcatacat 9620
tgatattatt tcctcacccg tttttatgta cttccatacg agtttccatt tttttctgct 9680
tttatactcc gactacacgt cgactgttca cctgcctctc ttttgttctt tctgttctgt 9740
tttcttctgt tctttcgcct cttgggattc tatactctcc tccgcatr.ta cacatgctca 9800
tgttaacgtc tgactcagag ttcaccagga tatgtcgtga gagcccgaaa caagttgcac 9860
aacatatatt gataatgatc agaacactct aagaccaccc agtccatgat cagccgcatc 9920
gccagtttcg atctcttctc cattctcatc aacctcaatc tcctcccgga tcgtcctgcc 9980
cagcagactg ccgaataact cgccgacctg ctcctcctgc cacaagtcct ccgttcgctc 10040
aggaaccatg aagttcatga tcttttcttg gggggtatat cgaagcttgc gacctttaga 10100
agctcgtgta tcgagggtgg gcttgtgctt tctgggtccg taattggaaa aggttgcttg 10160
gcccatttca aaataaacga aattgatgat tatacaccgc cgtagaccgt ttctggtcag 10220
gattttgtgt tggacgatga tataccgatc gatgtttgag cagacaaggg agttaggaag 10280
agactactta ccactcatag cgccgacccc agcacctcca cctcttcgct cgatgacgtc 10340
tctgaccaag ctctggtaaa actctttgtc atcaccccaa acggcggcct cacattcagc 10400
ctcatcctga gagaogagtc ccatgaaccg atctactttt ttcctaccct ctagaccccc 10460
aagggaagct ccaatttgct cgacgactcc gatcttgacg gatttaaact tttcacctcg 10520
aagattctga aggccctgag cggtcataat cttggaagac c 10561



<210> 2 

<211> 6645 

<212> DNA

<213> Phaffia rhodozyma
<220>

<221> CDS

<222> (1)..(6645)
<223>

<400> 2

atg gtt gtc gat cac gag agc gta agg cat ttc atc ggt gga aac gca 48
Met Val Val Asp His Glu Ser Val Arg His Phe Ile Gly Gly Asn Ala
1 5 10 15

ctt gag aac gcc cct ccg tca agc gtc acc gat ttc gtt aga agt caa 96
Leu Glu Asn Ala Pro Pro Ser Ser Val Thr Asp Phe Val Arg Ser Gln
20 25 30

gat ggt cac acg gtc atc acc aaa gtc ctc att gcc aac aac gga atc 144
Asp Gly His Thr Val Ile Thr Lys Val Leu Ile Ala Asn Asn Gly Ile
35 40 45

gct gct gta aaa gag atc cga tca gtt cgt aaa tgg gct tac gag acg 192
Ala Ala Val Lys Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr
50 55 60

ttt gga gat gag cga gcc atc gaa ttt acg gta atg gcc act cca gaa 240
Phe Gly Asp Glu Arg Ala Ile Glu Phe Thr Val Met Ala Thr Pro Glu
65 70 75 80

gat ttg aag gtg aac tgc gac tat att cga atg gct gat cga gtc gtc 288
Asp Leu Lys Val Asn Cys Asp Tyr Ile Arg Met Ala Asp Arg Val Val
85 90 95

gaa gtt cct gga gga act aac aac aac aat cac tct aac gtc gac ctc 336
Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val Asp Leu
100 105 110

atc gtt gac att gcc gag cga ttc aat ata cat gct gtt tgg gct gga 384
Ile Val Asp Ile Ala Glu Arg Phe Asn Ile His Ala Val Trp Ala Gly
115 120 125

tgg ggt cac gct tcg gaa aac ccc aga ctt ccc gag tct ctc gcc gcc 432
Trp Gly His Ala Ser Glu Asn Pro Arg Leu Pro Glu Ser Leu Ala Ala
130 135 140

tca aag aac aag atc gtc ttc att ggt cct ccc gga tcc gct atg cga 480
Ser Lys Asn Lys Ile Val Phe Ile Gly Pro Pro Gly Ser Ala Met Arg
145 150 155 160

tcc ctt gga gac aag att tct tcg acc atc gtt gcc cag tct gcc cag 528
Ser Leu Gly Asp Lys Ile Ser Ser Thr Ile Val Ala Gln Ser Ala Gln
165 170 175

gtg ccg tgt atg gcc tgg tct gga tca ggc atc act gat aca gag ctc 576
Val Pro Cys Met Ala Trp Ser Gly Ser Gly Ile Thr Asp Thr Glu Leu
180 185 190

agc cct cag ggc ttc gtg act gtg ccc gat ggg cca tat cag gct gct 624
Ser Pro Gln Gly Phe Val Thr Val Pro Asp Gly Pro Tyr Gln Ala Ala
195 200 205

tgt gta aag acg gtg gag gat ggt ttg gtg cga gcc gag aag atc ggt 672
Cys Val Lys Thr Val Glu Asp Gly Leu Val Arg Ala Glu Lys Ile Gly
210 215 220

ttg cca gtt atg atc aag gcc tct gag gga gga gga gga aag ggt atc 720
Leu Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile
225 230 235 240

cga atg gtt cac agc atg gac aca ttc aag aac tcc tac aac tcc gtc 768
Arg Met Val His Ser Met Asp Thr Phe Lys Asn Ser Tyr Asn Ser Val
245 250 255

gct tcc gag gtg cca gga tcc ccg att ttc atc atg gcc ttg gct gga 816
Ala Ser Glu Val Pro Gly Ser Pro Ile Phe Ile Met Ala Leu Ala Gly
260 265 270

tct gct cga cat ttg gag gtc cag ctc ctt gct gat cag tac gga aac 864
Ser Ala Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Asn
275 280 285

gct atc tct ttg ttc ggt cga gat tgc tct gtt cag cga cga cat cag 912
Ala Ile Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln
290 295 300

aag atc att gag gag gct ccc gtc acg atc gct cgt cca gag aga ttc 960
Lys Ile Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Pro Glu Arg Phe
305 310 315 320

gaa gag atg gag aag gct gct gtc agg ttg gcc aag tta gta gga tat 1008
Glu Glu Met Glu Lys Ala Ala Val Arg Leu Ala Lys Leu Val Gly Tyr
325 330 335

gtt agt gcc ggt acc gtc gaa tac ctc tac tct cac gcc gac gac tca 1056
Val Ser Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Asp Asp Ser
340 345 350

ttc ttc ttc ctc gaa ctc aac cct cga ctt caa gtc gag cac cct act 1104
Phe Phe Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Thr
355 360 365

acc gag atg gtc tcg ggt gtc aac ctc ccc gct gct cag ctt cag att 1152
Thr Glu Met Val Ser Gly Val Asn Leu Pro Ala Ala Gln Leu Gln Ile
370 375 380

gct atg ggt atc cct ctt tct cga att cgg gat att cga gtc ctc tac 1200
Ala Met Gly Ile Pro Leu Ser Arg Ile Arg Asp Ile Arg Val Leu Tyr
385 390 395 400

ggt ctc gat ccc cac act gtt tcc gag atc gac ttc gac agc agc aga 1248
Gly Leu Asp Pro His Thr Val Ser Glu Ile Asp Phe Asp Ser Ser Arg
405 410 415

gcg gag tct gtc cag act cag agg aag cct agg ccc aag ggt cac gtc 1296
Ala Glu Ser Val Gln Thr Gln Arg Lys Pro Arg Pro Lys Gly His Val
420 425 430

att gcc tgt cga atc acg agt gaa aac ccc gat gag ggg ttc aag ccg 1344
Ile Ala Cys Arg Ile Thr Ser Glu Asn Pro Asp Glu Gly Phe Lys Pro
435 440 445

tct gcc gga gat atc caa gag ttg aac ttc aga agt aat act aac gtc 1392
Ser Ala Gly Asp Ile Gln Glu Leu Asn Phe Arg Ser Asn Thr Asn Val
450 455 460

tgg gga tac ttc tct gtt gga gct act gga gga att cat agt ttc gcc 1440
Trp Gly Tyr Phe Ser Val Gly Ala Thr Gly Gly Ile His Ser Phe Ala
465 470 475 480

gat tct caa ttc ggt cac gtg ttt gct tat ggc tcc gac cga acg act 1488
Asp Ser Gln Phe Gly His Val Phe Ala Tyr Gly Ser Asp Arg Thr Thr
485 490 495

gcc aga aag aat atg gtt atc gcc ttg aaa gag ctt tcc att cga gga 1536
Ala Arg Lys Asn Met Val Ile Ala Leu Lys Glu Leu Ser Ile Arg Gly
500 505 510

gac ttc cga acc act gtc gag tat ctt atc act ctt ctt gag acg agc 1584
Asp Phe Arg Thr Thr Val Glu Tyr Leu Ile Thr Leu Leu Glu Thr Ser
515 520 525

gat ttc gag cag aac gcc att acc acc gct tgg ttg gat ggg ttg atc 1632
Asp Phe Glu Gln Asn Ala Ile Thr Thr Ala Trp Leu Asp Gly Leu Ile
530 535 540

act aac aag ctt aca tct gag agg cct gat cca tca ctg gcc gtt att 1680
Thr Asn Lys Leu Thr Ser Glu Arg Pro Asp Pro Ser Leu Ala Val Ile
545 550 555 560

tgt ggt gca att gtg aaa gct cac gtg gct tct gag aac tgt tgg gcc 1728
Cys Gly Ala Ile Val Lys Ala His Val Ala Ser Glu Asn Cys Trp Ala
565 570 575

gaa tac cga cga gta ttg gac aag gga cag gtt ccc tcc aag gac act 1776
Glu Tyr Arg Arg Val Leu Asp Lys Gly Gln Val Pro Ser Lys Asp Thr
580 585 590

ctc aag aca gtg ttc act ctt gat ttc atc tat gag ggt gtt cgg tac 1824
Leu Lys Thr Val Phe Thr Leu Asp Phe Ile Tyr Glu Gly Val Arg Tyr
595 600 605

aat ttc acc gct gct cga gcc tcc ctc aac act tac cga ttg tat cta 1872
Asn Phe Thr Ala Ala Arg Ala Ser Leu Asn Thr Tyr Arg Leu Tyr Leu
610 615 620

aac gga gga aag acc gtg gtg tcc atc cga cct ttg gcc gat ggt gga 1920
Asn Gly Gly Lys Thr Val Val Ser Ile Arg Pro Leu Ala Asp Gly Gly
625 630 635 640

atg ctc gtt ctc ctc gat ggc cga tcc cac act ctc tac tgg agg gag 1968
Met Leu Val Leu Leu Asp Gly Arg Ser His Thr Leu Tyr Trp Arg Glu
645 650 655

gaa gtc ggt acc ctc cga att cag gta gac gca aag act tgc ctg att 2016
Glu Val Gly Thr Leu Arg Ile Gln Val Asp Ala Lys Thr Cys Leu Ile
660 665 670

gag cag gag aac gac ccc act cag ctc cga tca ccc tcg cct gga aag 2064
Glu Gln Glu Asn Asp Pro Thr Gln Leu Arg Ser Pro Ser Pro Gly Lys
675 680 685

atc atc cgg ttt ttg gtc gaa agc gga gat cac atc tcc tcc gga gat 2112
Ile Ile Arg Phe Leu Val Glu Ser Gly Asp His Ile Ser Ser Gly Asp
690 695 700

atc tat gct gag gtt gag gtc atg aag atg atc ttg ccc ttg att gcc 2160
Ile Tyr Ala Glu Val Glu Val Met Lys Met Ile Leu Pro Leu Ile Ala
705 710 715 720

cag gag tcc ggt cac gtt cag ttt gtc aag caa gcc ggt gtg acc gtc 2208
Gln Glu Ser Gly His Val Gln Phe Val Lys Gln Ala Gly Val Thr Val
725 730 735

gat cct gga gcg att att ggg atc ttg agt ctt gat gac ccc acg cga 2256
Asp Pro Gly Ala Ile Ile Gly Ile Leu Ser Leu Asp Asp Pro Thr Arg
740 745 750

gtg aag aag gcg aag ccc ttc gag ggt ctc ctg cct gtg act ggt ctc 2304
Val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro Val Thr Gly Leu
755 760 765

cct aac ctg ccc ggt aac aga cct cac cag cgg cta cag ttc cag ctt 2352
Pro Asn Leu Pro Gly Asn Arg Pro His Gln Arg Leu Gln Phe Gln Leu
770 775 780

gag tcg ata tac tcg gtc ttg gat gga tac gag agt gac tcc act gca 2400
Glu Ser Ile Tyr Ser Val Leu Asp Gly Tyr Glu Ser Asp Ser Thr Ala
785 790 795 800

aca atc ctc cga tca ttc tct gaa aac ctt tat gat cct gat ctt gct 2448
Thr Ile Leu Arg Ser Phe Ser Glu Asn Leu Tyr Asp Pro Asp Leu Ala
805 810 815

ttc gga gag gct tta tcc atc att tcc gtc ctt tct ggg aga atg cct 2496
Phe Gly Glu Ala Leu Ser Ile Ile Ser Val Leu Ser Gly Arg Met Pro
820 825 830

gcc gat ctt gag gag agc att cga gag gtc atc agc gaa gct cag tcg 2544
Ala Asp Leu Glu Glu Ser Ile Arg Glu Val Ile Ser Glu Ala Gln Ser
835 840 845

aag cct cac gcc gag ttc cct gga tca aag atc ctc aaa gtc gtc gag 2592
Lys Pro His Ala Glu Phe Pro Gly Ser Lys Ile Leu Lys Val Val Glu
850 855 860

cgg tac atc gat aat ttg cga cct cag gag agg gct atg gtc cga act 2640
Arg Tyr Ile Asp Asn Leu Arg Pro Gln Glu Arg Ala Met Val Arg Thr
865 870 875 880
cag atc gaa ccc atc gtt ggt att gct gag aag aac gtt ggc ggt cct 2688
Gln Ile Glu Pro Ile Val Gly Ile Ala Glu Lys Asn Val Gly Gly Pro
885 890 895

aag ggt tac gcc tct tac gtc tta gct acc atc ctt caa aag ttc ttg 2736
Lys Gly Tyr Ala Ser Tyr Val Leu Ala Thr Ile Leu Gln Lys Phe Leu
900 905 910

gcc gtt gag gcc gtt ttt gct act ggt agt gaa gag gcc att gtt ctc 2784
Ala Val Glu Ala Val Phe Ala Thr Gly Ser Glu Glu Ala Ile Val Leu
915 920 925

caa ctt cga gat gaa aac cga gaa tct ttg aac gac gtc ctt ggt ctc 2832
Gln Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp Val Leu Gly Leu
930 935 940

gtc ctg gct cac tcg cgt ctc agc gct cga tcc aag ctt gtt ctc tcc 2880
Val Leu Ala His Ser Arg Leu Ser Ala Arg Ser Lys Leu Val Leu Ser
945 950 955 960

gtc ttt gat ctg atc aag tct atg cag ctc ctc aac aac act gag ggt 2928
Val Phe Asp Leu Ile Lys Ser Met Gln Leu Leu Asn Asn Thr Glu Gly
965 970 975

tct ttc ctt cat aag act atg aaa gcg ctt gcc gac atg ccc acc aag 2976
Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp Met Pro Thr Lys
980 985 990

gct cct ttg gcc agc aag gtg tct ttg aag gct cgg gaa att ctt atc 3024
Ala Pro Leu Ala Ser Lys Val Ser Leu Lys Ala Arg Glu Ile Leu Ile
995 1000 1005

tct tgc tct ctt ccc tct tac gag gag agg ttg ttc cag atg gaa 3069
Ser Cys Ser Leu Pro Ser Tyr Glu Glu Arg Leu Phe Gln Met Glu
1010 1015 1020

aag atc ctt aac tct tct gtc acc act tct tac tac gga gag act 3114
Lys Ile Leu Asn Ser Ser Val Thr Thr Ser Tyr Tyr Gly Glu Thr
1025 1030 1035

gga ggt gga cac aga aac cct tcg gtt gat gtt ctg act gag atc 3159
Gly Gly Gly His Arg Asn Pro Ser Val Asp Val Leu Thr Glu Ile
1040 1045 1050

tca aac tct cga ttc acc gtc tac gat gtc ctg tcc tcc ttc ttc 3204
Ser Asn Ser Arg Phe Thr Val Tyr Asp Val Leu Ser Ser Phe Phe
1055 1060 1065

aag cac gat gat cct tgg att gtt ctt gct agt ttg acc gtc tac 3249
Lys His Asp Asp Pro Trp Ile Val Leu Ala Ser Leu Thr Val Tyr
1070 1075 1080

gtt ctt cga gct tac cga gag tac agt att ctt gat atg caa cat 3294
Val Leu Arg Ala Tyr Arg Glu Tyr Ser Ile Leu Asp Met Gln His
1085 1090 1095

gag caa ggt cag gat ggc gct gct gga gtc atc act tgg cga ttc 3339
Glu Gln Gly Gln Asp Gly Ala Ala Gly Val Ile Thr Trp Arg Phe
1100 1105 1110

aag ctc aac cag ccc atc gct gag tct tct act ccc cga gtt gac 3384
Lys Leu Asn Gln Pro Ile Ala Glu Ser Ser Thr Pro Arg Val Asp
1115 1120 1125

tcg aat cga gac gtt tac cga gtc ggt tcg ctt tct gat ttg acc 3429
Ser Asn Arg Asp Val Tyr Arg Val Gly Ser Leu Ser Asp Leu Thr
1130 1135 1140

tac aag atc aag cag agt cag acc gag ccc ctc cga gct ggt gtc 3474
Tyr Lys Ile Lys Gln Ser Gln Thr Glu Pro Leu Arg Ala Gly Val
1145 1150 1155

atg acg agc ttc aac aac ttg aag gag gtt cag gac gga ctc ttg 3519
Met Thr Ser Phe Asn Asn Leu Lys Glu Val Gln Asp Gly Leu Leu
1160 1165 1170

aat gtt ctg tct ttc ttc cct gct tac cat cat caa gat ttc act 3564
Asn Val Leu Ser Phe Phe Pro Ala Tyr His His Gln Asp Phe Thr
1175 1180 1185

caa cga cat ggt cag gac agt gcc atg ccc aac gtt ctc aac att 3609
Gln Arg His Gly Gln Asp Ser Ala Met Pro Asn Val Leu Asn Ile
1190 1195 1200

gct atc cgg gct ttc gag gag aag gac gac atg tct gat ctt gat 3654
Ala Ile Arg Ala Phe Glu Glu Lys Asp Asp Met Ser Asp Leu Asp
1205 1210 1215
tgg gcc aag agt gtt gag tcg ctg gta atg cag atg tct gcc gag 3699
Trp Ala Lys Ser Val Glu Ser Leu Val Met Gln Met Ser Ala Glu
1220 1225 1230

atc cag aag aag gga att cga cga gtt acc ttc ttg gtt tgc cga 3744
Ile Gln Lys Lys Gly Ile Arg Arg Val Thr Phe Leu Val Cys Arg
1235 1240 1245

aag ggc gtt tac ccc tcc tac ttc acc ttc aga caa gag ggt gcc 3789
Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gln Glu Gly Ala
1250 1255 1260

cag ggc ccc tgg aga gag gag gag aag att cga aac atc gag cct 3834
Gln Gly Pro Trp Arg Glu Glu Glu Lys Ile Arg Asn Ile Glu Pro
1265 1270 1275

gct cta gcc agt cag ctt gag ctc aac cga ctc tcg aat ttc aag 3879
Ala Leu Ala Ser Gln Leu Glu Leu Asn Arg Leu Ser Asn Phe Lys
1280 1285 1290

gtc acc cct atc ttc gta gac aac aga cag atc cac atc tac aag 3924
Val Thr Pro Ile Phe Val Asp Asn Arg Gln Ile His Ile Tyr Lys
1295 1300 1305

gga gtg ggt aag gag aac tct tcc gat gtt cga ttc ttt atc cgg 3969
Gly Val Gly Lys Glu Asn Ser Ser Asp Val Arg Phe Phe Ile Arg
1310 1315 1320

gct ttg gtt cga cct gga cgg gtc cag gga tcg atg aag gct gcc 4014
Ala Leu Val Arg Pro Gly Arg Val Gln Gly Ser Met Lys Ala Ala
1325 1330 1335

gag tat ctc atc tcc gag tgc gat cga ctg ctc act gat atc ctg 4059
Glu Tyr Leu Ile Ser Glu Cys Asp Arg Leu Leu Thr Asp Ile Leu
1340 1345 1350

gac gcc ttg gag gtt gtt gga gcc gag act cga aac gcc gat tgc 4104
Asp Ala Leu Glu Val Val Gly Ala Glu Thr Arg Asn Ala Asp Cys
1355 1360 1365

aac cat gtt gga att aac ttc atc tat aac gtt ctt gtc gac ttc 4149
Asn His Val Gly Ile Asn Phe Ile Tyr Asn Val Leu Val Asp Phe
1370 1375 1380

gac gac gtc cag gag gcc ctt gcc ggg ttc att gag agg cac gga 4194
Asp Asp Val Gln Glu Ala Leu Ala Gly Phe Ile Glu Arg His Gly
1385 1390 1395

aag agg ctt tgg cga ctt cga gtg acc gct tct gaa atc cga atg 4239
Lys Arg Leu Trp Arg Leu Arg Val Thr Ala Ser Glu Ile Arg Met
1400 1405 1410

gtt ctt gag gac gac gag ggt aac gtc acc ccc atc cga tgc tgc 4284
Val Leu Glu Asp Asp Glu Gly Asn Val Thr Pro Ile Arg Cys Cys
1415 1420 1425

att gag aac gtt tct ggt ttc gtc gtg aag tac cac gcc tac cag 4329
Ile Glu Asn Val Ser Gly Phe Val Val Lys Tyr His Ala Tyr Gln
1430 1435 1440

gag gtt gag acc gag aag ggt act acc atc ttg aag tca atc gga 4374
Glu Val Glu Thr Glu Lys Gly Thr Thr Ile Leu Lys Ser Ile Gly
1445 1450 1455

gac ctt gga cct ctt cac ctt cag cct gtc aac cat gct tac cag 4419
Asp Leu Gly Pro Leu His Leu Gln Pro Val Asn His Ala Tyr Gln
1460 1465 1470

acc aag aac agt ctt cag ccc cga cga tac cag gct cac ttg gtt 4464
Thr Lys Asn Ser Leu Gln Pro Arg Arg Tyr Gln Ala His Leu Val
1475 1480 1485

gga acg act tac gtc tac gac tac ccc gat ctc ttc gtt cag agt 4509
Gly Thr Thr Tyr Val Tyr Asp Tyr Pro Asp Leu Phe Val Gln Ser
1490 1495 1500

ttg cgc aag gtt tgg gct gag gct gct gct aag att cct cac ctc 4554
Leu Arg Lys Val Trp Ala Glu Ala Ala Ala Lys Ile Pro His Leu
1505 1510 1515

cgg gtg cct agc gag cct ctt acc gct acc gag ctg gtt ctc gat 4599
Arg Val Pro Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp
1520 1525 1530

gag aac aac gag ctt cag gag gtc gag cga cct ccg ggt tcc aac 4644
Glu Asn Asn Glu Leu Gln Glu Val Glu Arg Pro Pro Gly Ser Asn
1535 1540 1545
tcg tgt ggt atg gtc gcc tgg atc ttc act atg ctc act ccc gag 4689
Ser Cys Gly Met Val Ala Trp Ile Phe Thr Met Leu Thr Pro Glu
1550 1555 1560

tat ccc aag ggt cga cga gta gtt gcc att gcc aac gat atc acc 4734
Tyr Pro Lys Gly Arg Arg Val Val Ala Ile Ala Asn Asp Ile Thr
1565 1570 1575

ttc aag att gga tcc ttt ggt cct aag gaa gac gat tac ttc ttc 4779
Phe Lys Ile Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe
1580 1585 1590

aag gct acc gaa att gcc aag aag ctg ggc ctc cct cga att tac 4824
Lys Ala Thr Glu Ile Ala Lys Lys Leu Gly Leu Pro Arg Ile Tyr
1595 1600 1605

ctc tct gcc aac agt gga gct aga ctc ggt atc gcg gag gag ctc 4869
Leu Ser Ala Asn Ser Gly Ala Arg Leu Gly Ile Ala Glu Glu Leu
1610 1615 1620

ttg cac atc ttc aag gcg gcc ttc gtt gac ccc gca aag cct tcc 4914
Leu His Ile Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser
1625 1630 1635

atg ggt att aag tat cta tac ttg acc cct gaa act tta tcc act 4959
Met Gly Ile Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr
1640 1645 1650

ctt gcc aag aag gga tcc agc gtc acc act gag gag atc gag gat 5004
Leu Ala Lys Lys Gly Ser Ser Val Thr Thr Glu Glu Ile Glu Asp
1655 1660 1665

gac ggc gag cga cga cac aag atc acc gcc atc atc ggt ctt gca 5049
Asp Gly Glu Arg Arg His Lys Ile Thr Ala Ile Ile Gly Leu Ala
1670 1675 1680

gag ggt ttg gga gtt gag tct ctt cga gga tcc ggt ctt att gct 5094
Glu Gly Leu Gly Val Glu Ser Leu Arg Gly Ser Gly Leu Ile Ala
1685 1690 1695

gga gcc acc act cga gct tac gag gag gga atc ttc acc atc tct 5139
Gly Ala Thr Thr Arg Ala Tyr Glu Glu Gly Ile Phe Thr Ile Ser
1700 1705 1710

ctc gtt act gcc cga tcg gtc ggt atc gga gct tac ttg gtt cga 5184
Leu Val Thr Ala Arg Ser Val Gly Ile Gly Ala Tyr Leu Val Arg
1715 1720 1725

ttg ggt cag cga gct att cag gtt gaa ggc aac cct atg atc ctt 5229
Leu Gly Gln Arg Ala Ile Gln Val Glu Gly Asn Pro Met Ile Leu
1730 1735 1740

act gga gct cag tct ctc aac aag gtg ctt gga cga gag gtt tac 5274
Thr Gly Ala Gln Ser Leu Asn Lys Val Leu Gly Arg Glu Val Tyr
1745 1750 1755

act tcc aac ctt cag ctt gga gga acc cag att atg gcc cga aac 5319
Thr Ser Asn Leu Gln Leu Gly Gly Thr Gln Ile Met Ala Arg Asn
1760 1765 1770

ggt acc acg cat ctc gtc gct gaa tct gat ctc gat ggt gct ctc 5364
Gly Thr Thr His Leu Val Ala Glu Ser Asp Leu Asp Gly Ala Leu
1775 1780 1785

aag gtc atc cag tgg ctc tcg tat gtg ccc gag cga aag ggc aag 5409
Lys Val Ile Gln Trp Leu Ser Tyr Val Pro Glu Arg Lys Gly Lys
1790 1795 1800

gcc att cct atc tgg cct tcc gag gac cct tgg gac cga act gtg 5454
Ala Ile Pro Ile Trp Pro Ser Glu Asp Pro Trp Asp Arg Thr Val
1805 1810 1815

acc tac gag cct ccc cga ggt cct tac gat cct cga tgg ttg ctt 5499
Thr Tyr Glu Pro Pro Arg Gly Pro Tyr Asp Pro Arg Trp Leu Leu
1820 1825 1830

gaa gga aag ccg gat gaa ggc ttg act ggt ctt ttc gac aag gga 5544
Glu Gly Lys Pro Asp Glu Gly Leu Thr Gly Leu Phe Asp Lys Gly
1835 1840 1845

tct ttc atg gag acc ctt gga gat tgg gcc aag act atc gtc acc 5589
Ser Phe Met Glu Thr Leu Gly Asp Trp Ala Lys Thr Ile Val Thr
1850 1855 1860

ggt cga gcc cga ctg gga ggc att cct atg ggt gtt att gct gtc 5634
Gly Arg Ala Arg Leu Gly Gly Ile Pro Met Gly Val Ile Ala Val
1865 1870 1875



-73- 



gaa acc agg acg ace gag aag ate ate get gee gat cct gec aac 
Glu Thr Arg Thr Thr Glu Lys He lie Ala Ala Asp Pro Ala Asn 
1880 1885 1890 



5 cct gca get ttc gag caa aag att atg gag get ggt cag gtt tgg 
Pro Ala Ala Phe Glu Gin Lys He Met Glu Ala Gly Gin Val Txp 
1895 1900 1905 



aac ccc aac get get tac aag acc get caa tec ate ttt gat ate 
10 Asn Pro Asn Ala Ala Tyr Lys Thr Ala Gin Ser He Phe Asp lie 
1910 1915 1920 

aac aag gag ggt ett cct ttg atg ate ctt gec aac ate cga ggt 
Asn Lys Glu Gly Leu Pro Leu Met He Leu Ala Asn He Arg Gly 
15 1925 1930 1935 



ttc tct gga gga cag ggt gat atg ttt gac get ate etc aag cag 

Phe Ser Gly Gly Gin Gly Asp Met Phe Asp Ala He Leu Lys Gin 
1940 1945 1950 

20 

ggt tct aag ate gtt gac ggt etc teg aac ttc aag cag cca gtg 

Gly Ser Lys He Val Asp Gly Leu Ser Asn Phe Lys Gin pro Val 
1955 1960 1965 



25 ttc gtc tat gtt gtc ccc aac gga gag ett cgt gga gga get tgg 
Phe val Tyr Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp 
1970 1975 1980 



gtc gtg ttg gat cct act ate 

30 Val Val Leu Asp Pro Thr He 

1985 1990 

get gat gaa acc get cga gga 

Ala Asp Glu Thr Ala Arg Gly 

2000 2005 

35 

gtt gag ate aag ttc cga cga 

val Glu He Lys Phe Arg Arg 
2015 2020 



aac ctt gec aag atg gag atg tac 
Asn Leu Ala Lys Met Glu Met Tyr 
1995 

gga att etc gag ceg gaa ggt ate 
Gly He Leu Glu Pro Glu Gly He 
2010 

gac aag gtc ate get acc atg gag 
Asp Lys Val He Ala Thr Met Glu 
2025 



40 



cga ttg gac gag acc tat gec tct etc aaa get gee 
Arg Leu Asp Glu Thr Tyr Ala Ser Leu Lys Ala Ala 
2030 2035 2040 



teg aac gac 
Ser Asn Asp 
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tca acc aag tct gcg gag gag cga get aag agt get gag cca etc 6174 
ser Thr Lys Ser Ala Glii Glu Arg Ala Lys Ser Ala Glu Leu Leu 
204S 2050 2055 

aag gca aga gag act cta ctt caa ccg acg tac ttg cag att gca 6219
Lys Ala Arg Glu Thr Leu Leu Gln Pro Thr Tyr Leu Gln Ile Ala
2060 2065 2070

cac ctt tac gct gat ctc cat gat cgt gtc gga cga atg gag gcc 6264
His Leu Tyr Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala
2075 2080 2085

aag ggt tgc gcg aag cga gcc gtc tgg gct gag gct cga cga ttc 6309
Lys Gly Cys Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe
2090 2095 2100

ttc tac tgg cga ctt cga cga cgt ctc aac gat gag cac atc ctg 6354
Phe Tyr Trp Arg Leu Arg Arg Arg Leu Asn Asp Glu His Ile Leu
2105 2110 2115

tct aag ttc gct gct gcc aac ccg gat ctt act ctc gag gag cga 6399
Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu Arg
2120 2125 2130

caa aac att ctc gac tct gtc gtc cag act gac ctc act gat gac 6444
Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp Asp
2135 2140 2145

cga gcc acc gct gaa tgg att gag cag tct gca gaa gag att gct 6489
Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile Ala
2150 2155 2160

gct gcc gtt gcc gaa gtc cga tcc acc tac gtg tcg aat aag att 6534
Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys Ile
2165 2170 2175

atc agc ttc gcc gag acg gag cga gct gga gcg ttg cag ggc ttg 6579
Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly Leu
2180 2185 2190

gtc gct gtc ttg agc act ttg aat gcg gaa gac aag aag gcc ctt 6624
Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala Leu
2195 2200 2205

gtt tct agc ctc ggt ctc eaa 6645
Val Ser Ser Leu Gly Leu
2210

<210> 3
<211> 2214
<212> PRT
<213> Phaffia rhodozyma

<400> 3

Met Val Val Asp His Glu Ser Val Arg His Phe Ile Gly Gly Asn Ala
1 5 10 15

Leu Glu Asn Ala Pro Pro Ser Ser Val Thr Asp Phe Val Arg Ser Gln
20 25 30

Asp Gly His Thr Val Ile Thr Lys Val Leu Ile Ala Asn Asn Gly Ile
35 40 45

Ala Ala Val Lys Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr
50 55 60

Phe Gly Asp Glu Arg Ala Ile Glu Phe Thr Val Met Ala Thr Pro Glu
65 70 75 80

Asp Leu Lys Val Asn Cys Asp Tyr Ile Arg Met Ala Asp Arg Val Val
85 90 95

Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val Asp Leu
100 105 110

Ile Val Asp Ile Ala Glu Arg Phe Asn Ile His Ala Val Trp Ala Gly
115 120 125

Trp Gly His Ala Ser Glu Asn Pro Arg Leu Pro Glu Ser Leu Ala Ala
130 135 140

Ser Lys Asn Lys Ile Val Phe Ile Gly Pro Pro Gly Ser Ala Met Arg
145 150 155 160

Ser Leu Gly Asp Lys Ile Ser Ser Thr Ile Val Ala Gln Ser Ala Gln
165 170 175

Val Pro Cys Met Ala Trp Ser Gly Ser Gly Ile Thr Asp Thr Glu Leu
180 185 190

Ser Pro Gln Gly Phe Val Thr Val Pro Asp Gly Pro Tyr Gln Ala Ala
195 200 205

Cys Val Lys Thr Val Glu Asp Gly Leu Val Arg Ala Glu Lys Ile Gly
210 215 220

Leu Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile
225 230 235 240

Arg Met Val His Ser Met Asp Thr Phe Lys Asn Ser Tyr Asn Ser Val
245 250 255

Ala Ser Glu Val Pro Gly Ser Pro Ile Phe Ile Met Ala Leu Ala Gly
260 265 270

Ser Ala Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Asn
275 280 285

Ala Ile Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln
290 295 300

Lys Ile Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Pro Glu Arg Phe
305 310 315 320

Glu Glu Met Glu Lys Ala Ala Val Arg Leu Ala Lys Leu Val Gly Tyr
325 330 335

Val Ser Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Asp Asp Ser
340 345 350

Phe Phe Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Thr
355 360 365

Thr Glu Met Val Ser Gly Val Asn Leu Pro Ala Ala Gln Leu Gln Ile
370 375 380

Ala Met Gly Ile Pro Leu Ser Arg Ile Arg Asp Ile Arg Val Leu Tyr
385 390 395 400

Gly Leu Asp Pro His Thr Val Ser Glu Ile Asp Phe Asp Ser Ser Arg
405 410 415

Ala Glu Ser Val Gln Thr Gln Arg Lys Pro Arg Pro Lys Gly His Val
420 425 430

Ile Ala Cys Arg Ile Thr Ser Glu Asn Pro Asp Gln Gly Phe Lys Pro
435 440 445

Ser Ala Gly Asp Ile Gln Glu Leu Asn Phe Arg Ser Asn Thr Asn Val
450 455 460

Trp Gly Tyr Phe Ser Val Gly Ala Thr Gly Gly Ile His Ser Phe Ala
465 470 475 480

Asp Ser Gln Phe Gly His Val Phe Ala Tyr Gly Ser Asp Arg Thr Thr
485 490 495

Ala Arg Lys Asn Met Val Ile Ala Leu Lys Glu Leu Ser Ile Arg Gly
500 505 510

Asp Phe Arg Thr Thr Val Glu Tyr Leu Ile Thr Leu Leu Glu Thr Ser
515 520 525

Asp Phe Glu Gln Asn Ala Ile Thr Thr Ala Trp Leu Asp Gly Leu Ile
530 535 540

Thr Asn Lys Leu Thr Ser Glu Arg Pro Asp Pro Ser Leu Ala Val Ile
545 550 555 560

Cys Gly Ala Ile Val Lys Ala His Val Ala Ser Glu Asn Cys Trp Ala
565 570 575

Glu Tyr Arg Arg Val Leu Asp Lys Gly Gln Val Pro Ser Lys Asp Thr
580 585 590

Leu Lys Thr Val Phe Thr Leu Asp Phe Ile Tyr Glu Gly Val Arg Tyr
595 600 605

Asn Phe Thr Ala Ala Arg Ala Ser Leu Asn Thr Tyr Arg Leu Tyr Leu
610 615 620

Asn Gly Gly Lys Thr Val Val Ser Ile Arg Pro Leu Ala Asp Gly Gly
625 630 635 640

Met Leu Val Leu Leu Asp Gly Arg Ser His Thr Leu Tyr Trp Arg Glu
645 650 655

Glu Val Gly Thr Leu Arg Ile Gln Val Asp Ala Lys Thr Cys Leu Ile
660 665 670

Glu Gln Glu Asn Asp Pro Thr Gln Leu Arg Ser Pro Ser Pro Gly Lys
675 680 685

Ile Ile Arg Phe Leu Val Glu Ser Gly Asp His Ile Ser Ser Gly Asp
690 695 700

Ile Tyr Ala Glu Val Glu Val Met Lys Met Ile Leu Pro Leu Ile Ala
705 710 715 720

Gln Glu Ser Gly His Val Gln Phe Val Lys Gln Ala Gly Val Thr Val
725 730 735

Asp Pro Gly Ala Ile Ile Gly Ile Leu Ser Leu Asp Asp Pro Thr Arg
740 745 750

Val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro Val Thr Gly Leu
755 760 765

Pro Asn Leu Pro Gly Asn Arg Pro His Gln Arg Leu Gln Phe Gln Leu
770 775 780

Glu Ser Ile Tyr Ser Val Leu Asp Gly Tyr Glu Ser Asp Ser Thr Ala
785 790 795 800

Thr Ile Leu Arg Ser Phe Ser Glu Asn Leu Tyr Asp Pro Asp Leu Ala
805 810 815

Phe Gly Glu Ala Leu Ser Ile Ile Ser Val Leu Ser Gly Arg Met Pro
820 825 830

Ala Asp Leu Glu Glu Ser Ile Arg Glu Val Ile Ser Glu Ala Gln Ser
835 840 845

Lys Pro His Ala Glu Phe Pro Gly Ser Lys Ile Leu Lys Val Val Glu
850 855 860

Arg Tyr Ile Asp Asn Leu Arg Pro Gln Glu Arg Ala Met Val Arg Thr
865 870 875 880

Gln Ile Glu Pro Ile Val Gly Ile Ala Glu Lys Asn Val Gly Gly Pro
885 890 895

Lys Gly Tyr Ala Ser Tyr Val Leu Ala Thr Ile Leu Gln Lys Phe Leu
900 905 910

Ala Val Glu Ala Val Phe Ala Thr Gly Ser Glu Glu Ala Ile Val Leu
915 920 925

Gln Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp Val Leu Gly Leu
930 935 940

Val Leu Ala His Ser Arg Leu Ser Ala Arg Ser Lys Leu Val Leu Ser
945 950 955 960

Val Phe Asp Leu Ile Lys Ser Met Gln Leu Leu Asn Asn Thr Glu Gly
965 970 975

Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp Met Pro Thr Lys
980 985 990

Ala Pro Leu Ala Ser Lys Val Ser Leu Lys Ala Arg Glu Ile Leu Ile
995 1000 1005

Ser Cys Ser Leu Pro Ser Tyr Glu Glu Arg Leu Phe Gln Met Glu
1010 1015 1020

Lys Ile Leu Asn Ser Ser Val Thr Thr Ser Tyr Tyr Gly Glu Thr
1025 1030 1035

Gly Gly Gly His Arg Asn Pro Ser Val Asp Val Leu Thr Glu Ile
1040 1045 1050

Ser Asn Ser Arg Phe Thr Val Tyr Asp Val Leu Ser Ser Phe Phe
1055 1060 1065

Lys His Asp Asp Pro Trp Ile Val Leu Ala Ser Leu Thr Val Tyr
1070 1075 1080

Val Leu Arg Ala Tyr Arg Glu Tyr Ser Ile Leu Asp Met Gln His
1085 1090 1095

Glu Gln Gly Gln Asp Gly Ala Ala Gly Val Ile Thr Trp Arg Phe
1100 1105 1110

Lys Leu Asn Gln Pro Ile Ala Glu Ser Ser Thr Pro Arg Val Asp
1115 1120 1125

Ser Asn Arg Asp Val Tyr Arg Val Gly Ser Leu Ser Asp Leu Thr
1130 1135 1140

Tyr Lys Ile Lys Gln Ser Gln Thr Glu Pro Leu Arg Ala Gly Val
1145 1150 1155

Met Thr Ser Phe Asn Asn Leu Lys Glu Val Gln Asp Gly Leu Leu
1160 1165 1170

Asn Val Leu Ser Phe Phe Pro Ala Tyr His His Gln Asp Phe Thr
1175 1180 1185

Gln Arg His Gly Gln Asp Ser Ala Met Pro Asn Val Leu Asn Ile
1190 1195 1200

Ala Ile Arg Ala Phe Glu Glu Lys Asp Asp Met Ser Asp Leu Asp
1205 1210 1215

Trp Ala Lys Ser Val Glu Ser Leu Val Met Gln Met Ser Ala Glu
1220 1225 1230

Ile Gln Lys Lys Gly Ile Arg Arg Val Thr Phe Leu Val Cys Arg
1235 1240 1245

Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gln Glu Gly Ala
1250 1255 1260

Gln Gly Pro Trp Arg Glu Glu Glu Lys Ile Arg Asn Ile Glu Pro
1265 1270 1275

Ala Leu Ala Ser Gln Leu Glu Leu Abu Arg Leu Ser Asn Phe Lys
1280 1285 1290

Val Thr Pro Ile Phe Val Asp Asn Arg Gln Ile His Ile Tyr Lys
1295 1300 1305

Gly Val Gly Lys Glu Asn Ser Ser Asp Val Arg Phe Phe Ile Arg
1310 1315 1320

Ala Leu Val Arg Pro Gly Arg Val Gln Gly Ser Met Lys Ala Ala
1325 1330 1335

Glu Tyr Leu Ile Ser Glu Cys Asp Arg Leu Leu Thr Asp Ile Leu
1340 1345 1350

Asp Ala Leu Glu Val Val Gly Ala Glu Thr Arg Asn Ala Asp Cys
1355 1360 1365

Asn His Val Gly Ile Asn Phe Ile Tyr Asn Val Leu Val Asp Phe
1370 1375 1380

Asp Asp Val Gln Glu Ala Leu Ala Gly Phe Ile Glu Arg His Gly
1385 1390 1395

Lys Arg Leu Trp Arg Leu Arg Val Thr Ala Ser Glu Ile Arg Met
1400 1405 1410

Val Leu Glu Asp Asp Glu Gly Asn Val Thr Pro Ile Arg Cys Cys
1415 1420 1425

Ile Glu Asn Val Ser Gly Phe Val Val Lys Tyr His Ala Tyr Gln
1430 1435 1440

Glu Val Glu Thr Glu Lys Gly Thr Thr Ile Leu Lys Ser Ile Gly
1445 1450 1455

Asp Leu Gly Pro Leu His Leu Gln Pro Val Asn His Ala Tyr Gln
1460 1465 1470

Thr Lys Asn Ser Leu Gln Pro Arg Arg Tyr Gln Ala His Leu Val
1475 1480 1485

Gly Thr Thr Tyr Val Tyr Asp Tyr Pro Asp Leu Phe Val Gln Ser
1490 1495 1500

Leu Arg Lys Val Trp Ala Glu Ala Ala Ala Lys Ile Pro His Leu
1505 1510 1515

Arg Val Pro Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp
1520 1525 1530

Glu Asn Asn Glu Leu Gln Glu Val Glu Arg Pro Pro Gly Ser Asn
1535 1540 1545

Ser Cys Gly Met Val Ala Trp Ile Phe Thr Met Leu Thr Pro Glu
1550 1555 1560

Tyr Pro Lys Gly Arg Arg Val Val Ala Ile Ala Asn Asp Ile Thr
1565 1570 1575

Phe Lys Ile Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe
1580 1585 1590

Lys Ala Thr Glu Ile Ala Lys Lys Leu Gly Leu Pro Arg Ile Tyr
1595 1600 1605

Leu Ser Ala Asn Ser Gly Ala Arg Leu Gly Ile Ala Glu Glu Leu
1610 1615 1620

Leu His Ile Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser
1625 1630 1635

Met Gly Ile Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr
1640 1645 1650

Leu Ala Lys Lys Gly Ser Ser Val Thr Thr Glu Glu Ile Glu Asp
1655 1660 1665

Asp Gly Glu Arg Arg His Lys Ile Thr Ala Ile Ile Gly Leu Ala
1670 1675 1680

Glu Gly Leu Gly Val Glu Ser Leu Arg Gly Ser Gly Leu Ile Ala
1685 1690 1695

Gly Ala Thr Thr Arg Ala Tyr Glu Glu Gly Ile Phe Thr Ile Ser
1700 1705 1710

Leu Val Thr Ala Arg Ser Val Gly Ile Gly Ala Tyr Leu Val Arg
1715 1720 1725

Leu Gly Gln Arg Ala Ile Gln Val Glu Gly Asn Pro Met Ile Leu
1730 1735 1740

Thr Gly Ala Gln Ser Leu Asn Lys Val Leu Gly Arg Glu Val Tyr
1745 1750 1755

Thr Ser Asn Leu Gln Leu Gly Gly Thr Gln Ile Met Ala Arg Asn
1760 1765 1770

Gly Thr Thr His Leu Val Ala Glu Ser Asp Leu Asp Gly Ala Leu
1775 1780 1785

Lys Val Ile Gln Trp Leu Ser Tyr Val Pro Glu Arg Lys Gly Lys
1790 1795 1800

Ala Ile Pro Ile Trp Pro Ser Glu Asp Pro Trp Asp Arg Thr Val
1805 1810 1815

Thr Tyr Glu Pro Pro Arg Gly Pro Tyr Asp Pro Arg Trp Leu Leu
1820 1825 1830

Glu Gly Lys Pro Asp Glu Gly Leu Thr Gly Leu Phe Asp Lys Gly
1835 1840 1845

Ser Phe Met Glu Thr Leu Gly Asp Trp Ala Lys Thr Ile Val Thr
1850 1855 1860

Gly Arg Ala Arg Leu Gly Gly Ile Pro Met Gly Val Ile Ala Val
1865 1870 1875

Glu Thr Arg Thr Thr Glu Lys Ile Ile Ala Ala Asp Pro Ala Asn
1880 1885 1890

Pro Ala Ala Phe Glu Gln Lys Ile Met Glu Ala Gly Gln Val Trp
1895 1900 1905

Asn Pro Asn Ala Ala Tyr Lys Thr Ala Gln Ser Ile Phe Asp Ile
1910 1915 1920

Asn Lys Glu Gly Leu Pro Leu Met Ile Leu Ala Asn Ile Arg Gly
1925 1930 1935

Phe Ser Gly Gly Gln Gly Asp Met Phe Asp Ala Ile Leu Lys Gln
1940 1945 1950

Gly Ser Lys Ile Val Asp Gly Leu Ser Asn Phe Lys Gln Pro Val
1955 1960 1965

Phe Val Tyr Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp
1970 1975 1980

Val Val Leu Asp Pro Thr Ile Asn Leu Ala Lys Met Glu Met Tyr
1985 1990 1995

Ala Asp Glu Thr Ala Arg Gly Gly Ile Leu Glu Pro Glu Gly Ile
2000 2005 2010

Val Glu Ile Lys Phe Arg Arg Asp Lys Val Ile Ala Thr Met Glu
2015 2020 2025

Arg Leu Asp Glu Thr Tyr Ala Ser Leu Lys Ala Ala Ser Asn Asp
2030 2035 2040

Ser Thr Lys Ser Ala Glu Glu Arg Ala Lys Ser Ala Glu Leu Leu
2045 2050 2055

Lys Ala Arg Glu Thr Leu Leu Gln Pro Thr Tyr Leu Gln Ile Ala
2060 2065 2070

His Leu Tyr Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala
2075 2080 2085

Lys Gly Cys Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe
2090 2095 2100

Phe Tyr Trp Arg Leu Arg Arg Arg Leu Asn Asp Glu His Ile Leu
2105 2110 2115

Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu Arg
2120 2125 2130

Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp Asp
2135 2140 2145

Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile Ala
2150 2155 2160

Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys Ile
2165 2170 2175

Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly Leu
2180 2185 2190

Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala Leu
2195 2200 2205

Val Ser Ser Leu Gly Leu
2210

<210> 4
<211> 26
<212> DNA
<213> Artificial

<220>
<221> misc_feature
<222> (6)..(6)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (9)..(9)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (15)..(15)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (18)..(18)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (21)..(21)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (24)..(24)
<223> n is a, c, g or t

<400> 4

athggngcnt ayytngynmg nytngg 26

<210> 5
<211> 25
<212> DNA
<213> Artificial

<220>
<221> misc_feature
<222> (3)..(3)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (6)..(6)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (12)..(12)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (15)..(15)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (18)..(18)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (21)..(21)
<223> n is a, c, g or t

<220>
<221> misc_feature
<222> (24)..(24)
<223> n is a, c, g or t

<400> 5

acnacnaccc angcnccncc nckna 25

<210> 6
<211> 26
<212> DNA
<213> Artificial

<400> 6
ttaccctcgt cgtcctcaag aaccat 26

<210> 7
<211> 26
<212> DNA
<213> Artificial

<400> 7
tggatcctac tatcaacctg ccaaga 26

<210> 8
<211> 26
<212> DNA
<213> Artificial

<400> 8
gtgaacactg tcttgagagt gtcctt 26

<210> 9
<211> 20
<212> DNA
<213> Artificial

<400> 9
ccgctgctca gcttcagatt 20

<210> 10
<211> 19
<212> DNA
<213> Artificial

<400> 10
gattagatag ggatctagt 19



